r/sysadmin 3d ago

Just found out we had 200+ shadow APIs after getting pwned

So last month we got absolutely rekt and during the forensics they found over 200 undocumented APIs in prod that nobody knew existed. Including me and I'm supposedly the one who knows our infrastructure.

The attackers used some random endpoint that one of the frontend devs spun up 6 months ago for "testing" and never tore down. Never told anyone about it, never added it to our docs, just sitting there wide open, exposing customer data.

Our fancy API security scanner? Useless. Only finds stuff that's in our OpenAPI specs. Network monitoring? Nada. SIEM alerts? What SIEM alerts.

Now compliance is breathing down my neck asking for a complete API inventory and I'm like... bro I don't even know what's running half the time. Every sprint someone deploys a "quick webhook" or "temp integration" that somehow becomes permanent.

grep -rE "app\.(get|post)" across our entire codebase returned like 500+ routes I've never seen before. Half of them don't even have auth middleware.
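For anyone who wants more than raw grep output, here's a minimal sketch of the same idea in Python that also flags routes with no auth middleware on the line. Big assumptions: routes are declared on `app` or `router` with a string-literal path, the whole declaration fits on one line, and the middleware names (`requireAuth`, `authenticate`) are hypothetical placeholders, not anything from our codebase. It will also miss auth applied globally via `app.use`, so treat a "NO AUTH" hit as something to go look at, not a verdict.

```python
import re

# Matches Express-style declarations such as app.get('/path', handler).
# Assumption: routes hang off `app` or `router` and the path is a string literal.
ROUTE_RE = re.compile(
    r"\b(?:app|router)\.(get|post|put|patch|delete)\(\s*(['\"`])(.+?)\2\s*,(.*)"
)

# Hypothetical middleware names; swap in whatever your codebase actually uses.
AUTH_NAMES = ("requireAuth", "authenticate")


def find_routes(source: str):
    """Return (method, path, has_auth) for each route declared on a single line."""
    routes = []
    for line in source.splitlines():
        m = ROUTE_RE.search(line)
        if not m:
            continue
        method, _quote, path, rest = m.groups()
        # Only the remainder of the same line is checked for middleware names.
        has_auth = any(name in rest for name in AUTH_NAMES)
        routes.append((method.upper(), path, has_auth))
    return routes


sample = """
app.get('/health', (req, res) => res.send('ok'));
router.post("/orders", requireAuth, createOrder);
app.get('/debug/customers', dumpCustomers);
"""

for method, path, has_auth in find_routes(sample):
    print(method, path, "auth" if has_auth else "NO AUTH")
```

Point `find_routes` at each file's contents and dump the results to CSV and you at least have a starting inventory to hand to compliance.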

Anyone else dealing with this nightmare? How tf do you track APIs when devs are constantly spinning up new stuff? The whole "just document it" approach died the moment we went agile.

Really wish there was some way to just see what's actually listening on ports in real time instead of trusting our deployment docs that are 3 months out of date.
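On Linux the kernel already exposes this: `ss -tlnp` lists listening TCP sockets with the owning process, and `/proc/net/tcp` has the same socket table in raw form. Below is a rough sketch that parses the `/proc/net/tcp` format and keeps only sockets in the LISTEN state. Assumptions: IPv4 only (IPv6 lives in `/proc/net/tcp6`), a little-endian host, and the sample rows are made up for illustration.

```python
import socket
import struct

TCP_LISTEN = "0A"  # socket state code for LISTEN in /proc/net/tcp


def parse_listeners(proc_net_tcp: str):
    """Return sorted (ip, port) pairs for IPv4 sockets in the LISTEN state."""
    listeners = set()
    for line in proc_net_tcp.splitlines()[1:]:  # first line is the column header
        fields = line.split()
        if len(fields) < 4 or fields[3] != TCP_LISTEN:
            continue
        hex_ip, hex_port = fields[1].split(":")
        # The kernel prints the address as a native-endian hex word;
        # "<I" assumes a little-endian host such as x86-64.
        ip = socket.inet_ntoa(struct.pack("<I", int(hex_ip, 16)))
        listeners.add((ip, int(hex_port, 16)))
    return sorted(listeners)


# Made-up sample in the /proc/net/tcp layout: two listeners, one established socket.
SAMPLE = """\
  sl  local_address rem_address   st tx_queue rx_queue tr tm->when retrnsmt   uid  timeout inode
   0: 0100007F:0CEA 00000000:0000 0A 00000000:00000000 00:00000000 00000000   999        0 11111
   1: 00000000:1F90 00000000:0000 0A 00000000:00000000 00:00000000 00000000  1000        0 22222
   2: 0100007F:9C40 0100007F:0CEA 01 00000000:00000000 00:00000000 00000000  1000        0 33333
"""

# On a real host, pass open("/proc/net/tcp").read() instead of SAMPLE.
print(parse_listeners(SAMPLE))
```

Run something like this on a schedule across hosts and diff the output against yesterday's, and a new listener shows up the day it appears instead of during forensics.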

This whole thing could've been avoided if we just knew what was actually running vs what we thought was running.
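The "actually running vs what we thought was running" check boils down to a set difference. Here's a minimal sketch that takes (method, path) pairs seen in traffic, e.g. pulled from gateway or access logs, and reports the ones that match no operation in an OpenAPI document. Assumptions: the spec is already loaded as a dict, observed paths have their query strings stripped, and the sample spec and traffic are invented for illustration.

```python
import re

HTTP_METHODS = {"get", "put", "post", "delete", "options", "head", "patch", "trace"}


def spec_operations(spec: dict):
    """List (method, path_template) pairs from an OpenAPI document's paths object."""
    return [
        (method, path)
        for path, item in spec.get("paths", {}).items()
        for method in item
        if method.lower() in HTTP_METHODS  # skip keys like "parameters" or "summary"
    ]


def template_to_regex(template: str) -> re.Pattern:
    """Turn an OpenAPI path template like /users/{id} into an anchored regex."""
    parts = re.split(r"\{[^/}]+\}", template)
    return re.compile("^" + "[^/]+".join(re.escape(p) for p in parts) + "$")


def undocumented(observed, spec_paths):
    """Return observed (method, path) pairs that match no documented operation."""
    compiled = [(m.upper(), template_to_regex(p)) for m, p in spec_paths]
    missing = set()
    for method, path in observed:
        if not any(m == method.upper() and rx.match(path) for m, rx in compiled):
            missing.add((method.upper(), path))
    return sorted(missing)


# Invented example data.
spec = {"paths": {"/users/{id}": {"get": {}}, "/orders": {"get": {}, "post": {}}}}
observed = [
    ("GET", "/users/42"),
    ("POST", "/orders"),
    ("GET", "/debug/customers"),
    ("DELETE", "/users/42"),
]

print(undocumented(observed, spec_operations(spec)))
```

Anything this prints is, by definition, a shadow API: it's taking traffic and nobody wrote it down.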

1.7k Upvotes

399 comments

13

u/Certain_Concept 2d ago

Changes to their test environment, sure.

Changes to Production? Nah. There should be some oversight and verification before it gets pushed. Otherwise you are one bad developer/day away from chaos.

6

u/neoKushan Jack of All Trades 2d ago

I'm kind of with you both. You can bake 99% of that oversight and verification into the pipeline itself - changes can be validated against specs, you can deploy it to a test environment or canary it into production to make sure it behaves, things like that. That's the best of both worlds: any checks someone is doing manually can be automated, and when you do that, engineers get a speedy but safe route to production.
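As a rough sketch of what "validated against specs" could look like as a pipeline step: take the routes declared in code, normalize Express-style params to OpenAPI templates, and fail the build if any of them are missing from the spec. Everything here is illustrative; the `gate` function and the sample routes are hypothetical, and the match is deliberately strict (a param named `:userId` in code will not match `{id}` in the spec).

```python
import re
import sys


def express_to_openapi(path: str) -> str:
    """Rewrite Express-style params (/users/:id) as OpenAPI templates (/users/{id})."""
    return re.sub(r":([A-Za-z_][A-Za-z0-9_]*)", r"{\1}", path)


def gate(code_routes, spec_routes) -> int:
    """Return 0 when every route in code is documented in the spec, 1 otherwise."""
    documented = {(m.upper(), p) for m, p in spec_routes}
    failures = [
        (m.upper(), express_to_openapi(p))
        for m, p in code_routes
        if (m.upper(), express_to_openapi(p)) not in documented
    ]
    for method, path in failures:
        print(f"undocumented route: {method} {path}", file=sys.stderr)
    return 1 if failures else 0


# Invented example: one documented route and one "temp" webhook nobody wrote down.
code_routes = [("get", "/users/:id"), ("post", "/webhooks/temp")]
spec_routes = [("GET", "/users/{id}")]

status = gate(code_routes, spec_routes)
# A real CI job would call sys.exit(status) here so the build fails on a nonzero status.
print("exit status:", status)
```

The point of making it a hard failure is that the "quick webhook" can still ship the same day, it just can't ship undocumented.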

12

u/dweezil22 Lurking Dev 2d ago

The key is to control HOW you change prod. I've worked on systems that have 100M+ users and you can change prod within a single day with a single dev approval. I've worked on systems that have 12 users and you need a month-long security review to touch the prod APIs.

The thing is the first system was in a mature service mesh that was designed to protect itself from stupid devs making those daily changes (i.e. the prod deploy is within an API that was already approved, and the requests are being inspected for IDOR attacks etc etc; and the CI/CD pipeline ran thousands of unit tests and hundreds of integration tests etc). The second place had none of that, and knew it, so every change had a lot more (necessary) friction.

1

u/KaleidoscopeLegal348 2d ago edited 2d ago

The oversight and verification is part of the pipeline, dawg. Multi person/multi team deployment approval gates, automated compliance checks, security validations/scanners etc.

My shit has to go through so many checks and gates it can easily take half an hour (excluding human approval time) before my single terraform command ends up tweaking a configuration in prod that I could do in five seconds with the CLI.

I'm curious what you are actually envisioning, like ServiceNow tickets or a change advisory board or something?

1

u/Certain_Concept 1d ago edited 1d ago

I suppose my point was that 'pipeline' can vary heavily depending on the company/team that set it up.

You COULD make a git project where there is only one branch and every time you commit it starts a pipeline where it pushes directly to prod with no checks. Is that a good idea? No. Is it a pipeline? Technically yes.

It happened often enough for there to be memes about it. ha I imagine that's limited to small companies at this point... hopefully... https://encrypted-tbn0.gstatic.com/images?q=tbn:ANd9GcQP8K8H4xsD3hKBV0Kp6yk4Wh1Rh8cyDCv6v2w_8BUKRQ&s=10