A clean scan is not a clean bill of health
You ran a vulnerability scanner against your application, or a vendor ran one and sent you the report. It came back clean, or close to it, and everyone exhaled. The application is tested, the box is checked, and you can tell a customer the security questionnaire is handled.
The trouble is that the scan answered a narrower question than the one you heard. A scanner checks whether your application matches a list of known-bad patterns. It does not check whether the application does what it is supposed to do and refuses what it is supposed to refuse. Those are two different questions, and the second one is where breaches actually live.
This is not an argument against scanners. We run them, they earn their keep, and we will get to exactly what they are good at. The point is to be clear about the gap between the floor a scanner sets and the ceiling it never reaches, so a clean report doesn't get read as a guarantee it was never built to make.
What a scanner is actually good at
A scanner is fast, tireless, and reliable at the things it was designed to catch. Point it at an application and it will check thousands of known signatures in the time it takes to get coffee, and it will do it the same way every time. For a whole category of real problems, that is exactly what you want.
It catches outdated and vulnerable libraries, the dependency three versions behind with a published flaw. It catches missing or misconfigured security headers. It catches known vulnerabilities with a public identifier (a CVE, the industry's catalog number for a specific known flaw). It catches the obvious injection patterns, the input fields that fail in textbook ways. These are real findings, they get exploited, and you want them found and fixed.
So run scanners. Run them often, fold them into your build pipeline, and treat the output as a baseline you keep clean. The mistake is not running them. The mistake is thinking the clean result means the testing is done. A scanner sets the floor. It does not reach the ceiling, and the next sections are about the part of the room it can't see.
The flaw a scanner can't see: who's allowed to do what
The single most common serious flaw in web applications is broken access control: the application lets a user do or see something they should not be allowed to. In the OWASP Top 10:2025, broken access control holds the number one position, and in that dataset every application tested was found to have at least one instance of it, across more than 1.8 million total occurrences. (OWASP is the open, vendor-neutral community that publishes the industry's reference list of the most critical web application risks.)
A scanner misses it for a structural reason. When a scanner sends a request and the server replies 200 OK with a valid-looking response, the scanner reads that as success and moves on. It has no model of who you are, who you should be, or what this particular account is allowed to see. It cannot tell the difference between you reading your own record and you reading a stranger's, because both requests are structurally identical and both succeed.
OWASP says this plainly in its testing guidance. To test for the most common version of this flaw, the recommended method is to use at least two different user accounts and check whether one can reach what only the other should. A scanner typically runs as a single session. It does not arrive knowing that in your specific application, customer 1234 must never see customer 1235's invoice.
The illustrative version is almost boring in how simple it is. You are logged in, looking at your own account at a web address that ends in id=1234. You change it to 1235 and the application hands you another customer's record. Every part of that request is valid. The application simply never checked whether you were allowed to ask. A human tester with two accounts finds this in minutes. A scanner sees two successful requests and reports nothing wrong.
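To make the id=1234 example concrete, here is a minimal sketch in Python. Everything in it is hypothetical: the in-memory INVOICES table, the user names, and the two handler functions stand in for a real application, and the two check functions are simplified stand-ins for how a single-session scan and a two-account test look at the same endpoint.

```python
from dataclasses import dataclass

# Hypothetical in-memory "application": invoice id -> (owner, contents).
INVOICES = {
    1234: ("alice", "Invoice for Alice"),
    1235: ("bob", "Invoice for Bob"),
}


@dataclass
class Response:
    status: int
    body: str


def get_invoice_vulnerable(session_user: str, invoice_id: int) -> Response:
    """Returns any invoice that exists; never checks who is asking."""
    record = INVOICES.get(invoice_id)
    if record is None:
        return Response(404, "not found")
    return Response(200, record[1])


def get_invoice_checked(session_user: str, invoice_id: int) -> Response:
    """Same lookup, plus the ownership check the vulnerable version skips."""
    record = INVOICES.get(invoice_id)
    if record is None or record[0] != session_user:
        return Response(404, "not found")
    return Response(200, record[1])


def single_session_check(handler) -> bool:
    """Scanner-style view: one session, and a 200 response counts as healthy."""
    return handler("alice", 1234).status == 200


def two_account_check(handler) -> bool:
    """Tester-style view: can alice read the invoice that belongs to bob?

    True means access control is broken.
    """
    return handler("alice", 1235).status == 200
```

Both handlers pass single_session_check, because alice can read her own invoice either way. Only two_account_check tells them apart: it returns True for the vulnerable handler and False for the checked one.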
Breaking the rules the app assumes you'll follow
Past access control sits an even harder class for tools: business logic flaws. These are the cases where you follow the application's own steps, but in an order or with a value the designers never intended, and the application goes along with it.
The shapes are familiar once you see them. You skip a step in the checkout flow that was supposed to come first. You submit a negative quantity on an order and the system issues you a credit instead of a charge. You replay a one-time code or a password-reset token that was supposed to work exactly once. You change a hidden price field and the application bills you the number you set. You upgrade to a paid plan without a valid payment ever clearing. None of these involve a malformed or malicious-looking request. Each one is the application doing exactly what it was told, by someone who understood the rules well enough to bend them.
A scanner waves all of this through, because every request is structurally valid. OWASP's testing guide states it directly: this type of vulnerability "cannot be detected by a vulnerability scanner and relies upon the skills and creativity of the penetration tester." This is not a gap a better scanner closes next year. A tool can confirm a process works when used correctly. It has no way to know what counts as incorrect for your specific business, because that lives in your intent, not in your code's syntax.
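The negative-quantity case can be sketched the same way. This is an illustration under stated assumptions, not anyone's real checkout code: UNIT_PRICE and both functions are hypothetical, and the rule "quantity must be at least 1" is an example of the kind of intent that exists only in the business, not in the request format.

```python
from decimal import Decimal

UNIT_PRICE = Decimal("19.99")  # hypothetical catalog price


def order_total_naive(quantity: int) -> Decimal:
    """Well-formed arithmetic with no notion of what the business allows."""
    return UNIT_PRICE * quantity


def order_total_with_rule(quantity: int) -> Decimal:
    """Same arithmetic, with the business rule written down explicitly."""
    if quantity < 1:
        raise ValueError("quantity must be at least 1")
    return UNIT_PRICE * quantity
```

A request with a quantity of -3 is a perfectly valid integer, so order_total_naive returns a negative total, which a billing system would treat as a credit. Nothing about the request looks malicious. The only thing that makes it wrong is a rule someone has to know and write down.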
This blind spot also happens to be where attackers are heading. According to Imperva's State of API Security in 2024, business-logic attacks were the largest single category of API attacks in 2023, at about 27%, up from roughly 17% the year before. The fastest-growing target is the one a scanner structurally cannot represent.
Small findings that add up to a breach
Real attacks rarely come from one dramatic flaw. They come from a few unremarkable ones, lined up in the right order. A stack trace leaks an internal user ID. The IDs turn out to be sequential and easy to guess. A separate endpoint forgets to check authorization. On its own, each of those is a low-severity note. Chained together by someone who understands the application, they become account takeover.
A scanner scores findings in isolation. It rates the information leak as minor, the predictable IDs as informational, the missing check as moderate, and never connects them, because connecting them requires understanding what the application is for. The chain is invisible to a tool that grades one finding at a time. A human tester is specifically looking for the path that runs through all three.
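One way to picture the difference is to model each finding by what it needs and what it gives. The sketch below is a deliberately simplified illustration: the Finding fields, the severity labels, and the capability strings are all assumptions made up for this example, and the chain search only follows a single straight path rather than a full attack graph.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    name: str
    severity: str  # rating when the finding is judged on its own
    needs: str     # what the attacker must already have
    gives: str     # what the attacker gains


# Hypothetical findings, mirroring the three in the text.
FINDINGS = [
    Finding("stack trace leaks a user ID", "low",
            "anonymous access", "one valid user ID"),
    Finding("user IDs are sequential", "info",
            "one valid user ID", "any user ID"),
    Finding("endpoint skips authorization", "moderate",
            "any user ID", "account takeover"),
]

SEVERITY_ORDER = ["info", "low", "moderate", "high", "critical"]


def worst_isolated_severity(findings):
    """Grade one finding at a time and report the highest rating seen."""
    return max((f.severity for f in findings), key=SEVERITY_ORDER.index)


def find_chain(findings, start, goal):
    """Follow needs -> gives links from start; return the path if it reaches goal."""
    path, have = [], start
    remaining = list(findings)
    while have != goal:
        step = next((f for f in remaining if f.needs == have), None)
        if step is None:
            return None  # the chain breaks before reaching the goal
        path.append(step.name)
        remaining.remove(step)
        have = step.gives
    return path
```

Graded one at a time, the worst rating in FINDINGS is "moderate". Calling find_chain(FINDINGS, "anonymous access", "account takeover") returns all three findings in order, which is the path a human tester is looking for. Remove any one of the three and the function returns None.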
There is a quieter cost too. A scanner does not just miss the dangerous findings. It also reports a meaningful share of false positives, things that look like flaws but are not exploitable in practice. Someone has to sit with the report, separate the real from the noise, and confirm what an attacker could actually do with each item. A manual test inverts that. Instead of a long list of theoretical findings to triage, you get the handful that a person already proved can be exploited, with the steps to reproduce them.
Before you trust a "clean" report
So when your own team or a vendor hands you a clean report and calls the application tested, the useful move is to ask what kind of testing it was. "We got scanned" and "we got tested" are different claims, and the difference is exactly the part that matters most.
You do not need to be a security specialist to ask the right questions. You only need to know that a scan and a human test answer different things, and to check which one you are actually being shown.
If the answer is "we ran a scan and it was clean," that is worth something, but it is not the same as knowing your authorization logic holds up. The questions above tell you which one you have.
Where a real test fits
A scoped, manual, OWASP-aligned test is built to find the things in this post. A person works through your application as a real user would, with more than one account, looking for the access control gaps, the business logic it will go along with, and the small findings that chain into something larger. That is work a scanner cannot do, because it requires understanding what your application is supposed to prevent.
The test is only worth as much as what happens after it, though. A findings report that lands in a shared drive and dies there fixes nothing. Because we are one connected team, the findings don't stop at a PDF: they feed straight into remediation, and the patterns we see show up in what we watch for afterward in your detection and response. The test tells you where the doors are. Keeping them shut, and noticing when someone tries one, is the ongoing part.
If your "clean" report came back as a vendor security questionnaire response and you are not sure it proves what it claims, the same gap shows up in compliance evidence. It is the theme we cover in "we passed SOC 2 and still got breached." It is also the pattern behind why "we have antivirus" isn't endpoint management: a known tool mistaken for coverage it was never built to provide.