Sanitizing File Uploads Caveat
Reading the OWASP cheat sheet on input validation, I came across a rule that can do a lot of harm when it is not applied wisely. The rule to use "image rewriting libraries to verify the image is valid and to strip away extraneous content" can lead to the use of insecure code in your web application.
We all remember the not-yet-over series of Stagefright bugs in Android. I say "not-yet-over" because it is extremely hard to secure the highly polymorphic and at the same time low-abstraction-level C++ in which this complex library is written. In principle, sanitizing images with a similar library on your web server opens the way to vulnerabilities similar to Stagefright.
Why is the Stagefright series of bugs not over?
Let's have a look at some code:
This code, taken from here, is part of libstagefright. It parses Ogg Vorbis comments in a given media file. One quick look at it is enough to suspect that this code is insecure.
For instance:
- The commentLength parameter is trusted by this code, but in some use cases nothing ensures that it is correct.
- The read of comment[tagLen] might be a read outside the memory allocated for the comment.
- The extractAlbumArt parser is called; is it secure?
- In the last else branch, comment is not checked to be terminated with \0, which can lead to an out-of-bounds read.
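To make the points above concrete, here is a minimal sketch of what a more defensive version of the tag matching could look like. It is not the libstagefright code and not a proposed patch: the function name extractTagValue and its signature are hypothetical, and it assumes the caller really owns commentLength readable bytes. It checks the length before touching comment[tagLen] and copies the value by explicit length instead of relying on a terminating \0.

```cpp
#include <cctype>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical helper, not part of libstagefright.
// Looks for "TAG=value" in a buffer of exactly commentLength bytes.
// The buffer does not have to be NUL-terminated.
// On a match the value is copied into *out and true is returned.
bool extractTagValue(const char *comment, size_t commentLength,
                     const char *tag, std::string *out) {
    if (comment == nullptr || tag == nullptr || out == nullptr) {
        return false;
    }
    const size_t tagLen = std::strlen(tag);

    // The buffer must hold the tag plus the '=' separator,
    // otherwise reading comment[tagLen] would be out of bounds.
    if (commentLength <= tagLen) {
        return false;
    }

    // Case-insensitive comparison that never reads past tagLen bytes.
    for (size_t i = 0; i < tagLen; ++i) {
        const int a = std::tolower(static_cast<unsigned char>(comment[i]));
        const int b = std::tolower(static_cast<unsigned char>(tag[i]));
        if (a != b) {
            return false;
        }
    }
    if (comment[tagLen] != '=') {
        return false;
    }

    // Copy by explicit length; no assumption about a terminating '\0'.
    out->assign(comment + tagLen + 1, commentLength - tagLen - 1);
    return true;
}
```

Even a helper like this only covers the second and fourth points of the list. It still has to trust that commentLength matches the real buffer, and whatever parser receives the value afterwards needs the same scrutiny.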
With all due respect to the developers of this code (it was not written with security as a goal), when we talk about security, this code is a mess!
The best part about this piece of code is that I picked it purely at random from libstagefright. You can find megabytes of such code around! A good exploit developer should not need more than an hour to attack such code effectively.
What goals do such media libraries have? They have to process media data efficiently. They have to be fast, consume few resources, and offer a large number of features. As a result, the software that uses them (like a media player) will be fast, battery-saving, universal, but... not secure! That may be fine for a sandboxed media player in some scenarios, but it can turn into a disaster on your server!
Sanitizing files safely
So what you want is to make sure you implement file sanitizing safely. This is a very hard task, but let me share my intuition on how to start solving it.
The best first step would be to use, if available, a library that was developed with security in mind. It is highly unlikely that you will find enough such libraries. So you will probably have to stick with an insecure parsing library and sandbox it.
Next, sandboxing. The validation should run somewhere safe; read: "not on the web server itself". One potential solution is this: send the file to be validated to a trusted checking server, communicate between your web server and the checker in a secure manner, and validate this communication. Ensure the security of the checking server itself!
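As one small piece of securing the checking server itself, the untrusted parser can at least be run in a separate child process with hard resource limits. The sketch below assumes a POSIX system; the helper name runWithLimits and the idea of a standalone parser binary are my assumptions for illustration. Resource limits alone are not a sandbox: a real setup would also drop privileges and restrict file system and network access, and it does not replace moving the validation off the web server.

```cpp
#include <cerrno>
#include <sys/resource.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: runs argv[0] in a child process with CPU-time and
// address-space limits applied before exec.
// Returns the child's exit code (126 if the limits could not be set,
// 127 if exec failed), or -1 if fork/wait failed or the child was
// terminated by a signal, for example after exceeding the CPU limit.
int runWithLimits(char *const argv[], rlim_t cpuSeconds, rlim_t memoryBytes) {
    const pid_t pid = fork();
    if (pid < 0) {
        return -1;
    }
    if (pid == 0) {
        // Child: apply the limits, then replace the process image.
        struct rlimit cpu = {cpuSeconds, cpuSeconds};
        struct rlimit mem = {memoryBytes, memoryBytes};
        if (setrlimit(RLIMIT_CPU, &cpu) != 0 ||
            setrlimit(RLIMIT_AS, &mem) != 0) {
            _exit(126);
        }
        execvp(argv[0], argv);
        _exit(127);  // only reached if exec failed
    }

    // Parent: wait for the child, retrying if interrupted by a signal.
    int status = 0;
    while (waitpid(pid, &status, 0) < 0) {
        if (errno != EINTR) {
            return -1;
        }
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The caller should treat anything other than the parser's documented success code as "file rejected", so that a crash or a timeout inside the parser never counts as a successful validation.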
Conclusion
Not validating your input is a bad idea. But validating it with insecure code might endanger your server more than a sample of unvalidated input would. Apply the OWASP rules wisely.
Images copyright information
The image on this website is a screenshot, taken by me, of code from here. It is licensed under the Creative Commons Attribution 3.0 International license: http://creativecommons.org/licenses/by/3.0/
Thanks for reading my blog!