If we reinvent the wheel, it’s safe to say that initially it probably won’t run as smoothly as the one that’s been around for more than 6,000 years. So if all you need is a wheel and you’re not trying to sell a new wheel, it’s a good idea to stick with the existing design.
The same goes for software. If you just need a piece of functionality, the best solution is usually a library that has already implemented it.
But this leads to new problems. You have to trust software that you have not written yourself, and vulnerabilities in popular libraries suddenly affect not just one product, but thousands. How can we ensure quality?
You can manually check every single library that you are using. Or you can hire us to do it for you. But while both are good ideas, it is not realistic to do this for every library out there, or probably even for every library used in one of your products. There are simply too many of them, even if you only consider open source software.
Therefore, we need something that can help us speed things up a bit. That something is static code analysis. While it cannot replace a human, and humans still need to review its results, it lets us analyze huge amounts of code in a reasonable time.
Bandit
One fast and easy-to-use tool that does this for Python is Bandit. How easy to use? Let me demonstrate with the example of CVE-2023-25392, which I found using Bandit.
This vulnerability affects BigFlow prior to version 1.6.0. So let’s start with getting the latest vulnerable version:
$ git clone https://github.com/allegro/bigflow.git
$ cd bigflow
$ git checkout 1.5.4
If you don’t already have bandit, you can install it via pip:
$ pip install bandit
Now we run Bandit on the code we already checked out:
$ bandit -r bigflow
With the arguments -lll (severity) and -iii (confidence), we can filter the results to see only issues of high severity and high confidence (for example, bandit -r bigflow -lll -iii). It is a good idea to start with this kind of filter, but the remaining issues can be interesting as well.
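If you would rather post-process the findings yourself, Bandit can also write a machine-readable report (for example with -f json -o report.json). The following is a minimal sketch of the same high/high filter applied to such a report. The helper name high_high_issues and the sample data are made up for illustration, and the field names are the ones we would expect in Bandit's JSON output, so check them against the report your version produces.

```python
import json


def high_high_issues(report: dict) -> list:
    """Return only findings with HIGH severity and HIGH confidence."""
    return [
        issue
        for issue in report.get("results", [])
        if issue.get("issue_severity") == "HIGH"
        and issue.get("issue_confidence") == "HIGH"
    ]


# Hypothetical, hand-written sample in the assumed shape of a Bandit JSON report.
# With a real report you would use: json.load(open("report.json"))
sample_report = json.loads("""
{
  "results": [
    {"test_id": "B501", "filename": "bigflow/deploy.py", "line_number": 266,
     "issue_severity": "HIGH", "issue_confidence": "HIGH"},
    {"test_id": "B101", "filename": "example.py", "line_number": 1,
     "issue_severity": "LOW", "issue_confidence": "HIGH"}
  ]
}
""")

for issue in high_high_issues(sample_report):
    print(f'{issue["test_id"]} {issue["filename"]}:{issue["line_number"]}')
```

A script like this is also a convenient place to decide which findings should fail a build and which should only be logged.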
In our example we get three high/high issues:
- A call to tarfile.extractall
- A jinja2 environment without autoescaping
and the issue we are interested in today:
>> Issue: [B501:request_with_no_cert_validation] Call to requests with verify=False disabling SSL certificate checks, security issue.
   Severity: High   Confidence: High
   CWE: CWE-295 (https://cwe.mitre.org/data/definitions/295.html)
   More Info: https://bandit.readthedocs.io/en/1.7.5/plugins/b501_request_with_no_cert_validation.html
   Location: bigflow/deploy.py:266:15
265         headers = {'X-Vault-Token': vault_secret}
266         response = requests.get(vault_endpoint, headers=headers, verify=False)
267
Bandit already provides us with the location of the issue in bigflow/deploy.py:
def get_vault_token(vault_endpoint: str, vault_secret: str) -> str:
    if not vault_endpoint:
        raise ValueError('vault_endpoint is required')
    if not vault_secret:
        raise ValueError('vault_secret is required')
    headers = {'X-Vault-Token': vault_secret}
    response = requests.get(vault_endpoint, headers=headers, verify=False)
    if response.status_code != 200:
        logger.info(response.text)
        raise ValueError(
            'Could not get vault token, response code: {}'.format(
                response.status_code))
    logger.info("get oauth token from %s status_code=%s", vault_endpoint, response.status_code)
    return response.json()['data']['token']
The function get_vault_token sends the header X-Vault-Token with the vault secret to the vault endpoint without verifying the server certificate. This allows any MitM¹ attacker to read the vault secret and gain access to the vault.
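To get a feeling for what a check like B501 does in principle, here is a minimal sketch built on Python's standard ast module. It is not Bandit's actual implementation, and it is cruder: it flags any call that passes verify=False, no matter which library the call belongs to. The function name find_disabled_cert_checks and the sample snippet are our own.

```python
import ast


def find_disabled_cert_checks(source: str) -> list:
    """Return (line, column) of every call that passes verify=False."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        for kw in node.keywords:
            # Match only the literal constant False, not other falsy values.
            if (kw.arg == "verify"
                    and isinstance(kw.value, ast.Constant)
                    and kw.value.value is False):
                hits.append((node.lineno, node.col_offset))
    return hits


# The snippet is only parsed, never executed, so requests need not be installed.
snippet = (
    "import requests\n"
    "ok = requests.get('https://example.com')\n"
    "bad = requests.get('https://example.com', verify=False)\n"
)
print(find_disabled_cert_checks(snippet))  # → [(3, 6)]
```

Because the code is only parsed and never run, a check like this scales to large code bases. It also explains why a human still has to review the results: the check knows nothing about where the request goes or what it carries.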
Up to here, Bandit did most of the work. Our job now is to understand the impact of the vulnerability. To do that, we start by looking for calls to get_vault_token. There are two of them:
- bigflow.deploy.authenticate_to_registry
- bigflow.deploy.create_storage_client
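A text search is usually enough to find such callers, but the lookup can also be sketched with the ast module. The helper find_callers below is hypothetical, and the sample source is a heavily simplified stand-in for the real module (the signatures and bodies are invented). It only matches calls by name, so it will miss aliased imports and indirect calls, and a function that merely contains a nested function calling the target is reported as well.

```python
import ast


def find_callers(source: str, target: str) -> list:
    """Return names of functions whose body contains a call to `target`."""
    callers = []
    for func in ast.walk(ast.parse(source)):
        if not isinstance(func, (ast.FunctionDef, ast.AsyncFunctionDef)):
            continue
        for node in ast.walk(func):
            if isinstance(node, ast.Call):
                callee = node.func
                # Handle both plain calls f() and attribute calls module.f().
                name = getattr(callee, "id", None) or getattr(callee, "attr", None)
                if name == target:
                    callers.append(func.name)
                    break
    return callers


# Simplified stand-in source; the real functions look different.
sample = '''
def get_vault_token(endpoint, secret):
    return "token"

def authenticate_to_registry(endpoint, secret):
    return get_vault_token(endpoint, secret)

def create_storage_client(endpoint, secret):
    token = get_vault_token(endpoint, secret)
    return token

def unrelated():
    return 42
'''

print(find_callers(sample, "get_vault_token"))
```

Repeating the lookup for each caller found walks the call graph upwards, one level at a time, until it reaches an entry point.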
If we investigate further, we can see that the following entry points have code paths that lead to the vulnerable function:
- bigflow.cli._cli_deploy_dags (via bigflow deploy and bigflow deploy-dags)
- bigflow.cli._cli_deploy_image (via bigflow deploy)
- bigflow.cli._cli_build_image (via bigflow build-image)
This shows that the vulnerability is indeed relevant and could be exploited.
I reported this vulnerability to the developers in December 2022 and it was fixed with the release of BigFlow 1.6.0.
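The following is not the upstream patch, just a minimal sketch of how this class of bug can be avoided: leave certificate verification switched on and, if the server uses a private CA, pass the path to a CA bundle instead of disabling the check. The name get_vault_token_verified and the parameters http_get and ca_bundle are assumptions made for this example. http_get exists only so the function can be tried out without a network connection.

```python
def get_vault_token_verified(vault_endpoint, vault_secret,
                             http_get=None, ca_bundle=None):
    """Fetch a token while keeping TLS certificate verification enabled."""
    if not vault_endpoint:
        raise ValueError('vault_endpoint is required')
    if not vault_secret:
        raise ValueError('vault_secret is required')
    if http_get is None:
        import requests  # third-party; only imported for real network calls
        http_get = requests.get
    # True uses the default trust store; a path selects a custom CA bundle.
    verify = ca_bundle if ca_bundle else True
    headers = {'X-Vault-Token': vault_secret}
    response = http_get(vault_endpoint, headers=headers,
                        verify=verify, timeout=10)
    if response.status_code != 200:
        raise ValueError(
            'Could not get vault token, response code: {}'.format(
                response.status_code))
    return response.json()['data']['token']


# Offline demonstration with a fake HTTP layer (hypothetical values).
class FakeResponse:
    status_code = 200

    def json(self):
        return {'data': {'token': 'example-token'}}


captured = {}


def fake_get(url, **kwargs):
    captured.update(kwargs)  # remember how the request would have been made
    return FakeResponse()


token = get_vault_token_verified('https://vault.example', 'secret',
                                 http_get=fake_get)
print(token, captured['verify'])
```

With requests, verify accepts either a boolean or a path to a CA bundle, so supporting a private CA does not require turning verification off.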
Can static code analysis replace a manual security analysis? No, definitely not. But it is fast and it is scalable. If we continue to use random libraries from the internet (and believe me, we will), doing a basic security scan of our dependencies is the least we can do. That we can’t do everything right is no excuse for not doing anything.
Integrate static code analysis into your workflow (ideally via your CI pipeline) and check your own code and its dependencies.
¹ A man/monster/machine-in-the-middle (MitM) attack is an attack where an adversary “sits in between” the two ends of the communication and can read or even manipulate any message.