Improve CSV gzip decompression safety and error reporting #1587
Quillen09 wants to merge 1 commit into OpenEnergyDashboard:development
Conversation
huss left a comment
@Quillen09 Thank you for your first contribution to OED. I have made some comments to consider. Also, the description discusses unit tests but I don't see any tests. Could you explain or add them, especially given this is security-related code? Finally, can you explain your statement in the description that OED already enforced file limits but did not return a clear error message? I think that should be part of the record, along with an understanding of how this change relates to it. Please let me know if anything is not clear, you have thoughts, or you would like any assistance.
 * @returns {Promise<string>} - Full path of decompressed csv file
 */
async function gunzipToFileWithLimit(inputPath, outputDir, outputFilename, maxBytes) {
	await fsp.mkdir(outputDir, { recursive: true });
I'm unclear if OED creates new directories when it uploads files. If not, it would probably be more secure not to do this. Also, since the output directory seems to be hard coded, I'm unclear whether this could allow writing into inappropriate directories.
Also, I wonder if the size of the zipped file should be checked. I think it may be safe to assume that if the compressed file is already larger than the desired output size, then the unzipped file would be larger than desired. That would stop the saving of the file. This check may happen earlier, but enforcing it here seems a good idea.
	return new Promise((resolve, reject) => {
		const source = fs.createReadStream(inputPath);
		const gunzip = zlib.createGunzip();
		const dest = fs.createWriteStream(outPath, { flags: "wx" }); // fail if file exists
OED prefers comments on the line above code and not the same line.
			reject(err);
		};

		source.on("error", cleanup);
I'm unclear on how the err is being passed to cleanup by these. I may be missing something and am happy for you to explain.
		const cleanup = async (err) => {
			try { await fsp.unlink(outPath); } catch (_) {}
			reject(err);
I'm wondering if this should be in a finally block. If so, would the catch be needed?
I'm not an expert here but if this code issues a reject on the first call then would the other cleanup calls necessarily get executed/completed?
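One possible answer to the re-entrancy concern (a suggested sketch, not PR code): guard cleanup so that only the first error unlinks the partial file and rejects, and later calls become no-ops:

```javascript
// Sketch of an idempotent cleanup. The names mirror the PR's cleanup, but the
// once-guard itself is a suggested addition, not code from the PR.
function makeCleanup(unlink, reject) {
	let done = false;
	return async (err) => {
		// Several streams (source, gunzip, dest) can each emit an error;
		// only the first one should trigger the unlink and the reject.
		if (done) {
			return;
		}
		done = true;
		// Removing a partial output file may itself fail (e.g. it was never
		// created); ignore that and still surface the original error.
		try { await unlink(); } catch (_) {}
		reject(err);
	};
}
```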
@@ -0,0 +1,54 @@
const zlib = require("zlib");
This file lacks the MPL comment that OED includes in all files.
const fsp = require("fs").promises;
const path = require("path");

/**
The code is space indented but OED wants tabs.
		dest.on("finish", () => resolve(outPath));

		source.pipe(gunzip).pipe(dest);
This file lacks comments inside the function and I think they should be added. This is definitely an example of code that should be explained.
 * @param {string} inputPath - Path to uploaded gzip file
 * @param {string} outputDir - Directory to write decompressed csv
 * @param {string} outputFilename - Filename to write
 * @param {number} maxBytes - Max allowed decompressed output size in bytes
I think maxBytes should have a default value that enforces a reasonable limit. I also think this function should put some max limit on the size that can be passed.
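A sketch of the suggested default plus ceiling; the constant values here are placeholders, not OED's configured limits:

```javascript
// Placeholder limits for illustration only; OED's real values may differ.
const DEFAULT_MAX_BYTES = 200 * 1024 * 1024; // 200 MiB default
const HARD_MAX_BYTES = 1024 * 1024 * 1024;   // absolute 1 GiB ceiling

// Clamp a caller-supplied limit: fall back to the default when the value is
// missing or invalid, and never allow more than the hard ceiling.
function resolveMaxBytes(maxBytes) {
	if (!Number.isFinite(maxBytes) || maxBytes <= 0) {
		return DEFAULT_MAX_BYTES;
	}
	return Math.min(maxBytes, HARD_MAX_BYTES);
}
```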
 * @param {number} maxBytes - Max allowed decompressed output size in bytes
 * @returns {Promise<string>} - Full path of decompressed csv file
 */
async function gunzipToFileWithLimit(inputPath, outputDir, outputFilename, maxBytes) {
I don't see this function used anywhere so I'm unclear on how it is actually used to enforce limits. For example, how does it relate to the gunzip usage in src/server/routes/csv.js and how it is done there.
Sorry to add something after I submitted the review. I believe there are several other aspects of potentially malicious behavior of gzip files to address. I suspect some are fairly straightforward to implement. You note some in the description. Given this, I wanted to know if you plan to address those. If not, I will have to decide whether this should be merged or whether it should wait until further enhancements are made.
Description
This pull request improves protection against gzip-based denial‑of‑service attacks
during CSV uploads by making the decompression limit explicit in the failure message.
Previously, oversized gzip uploads were rejected with a generic error that did not
indicate what limit was exceeded. This change updates the gunzip utility to include
the configured maximum decompressed size (in bytes) in the error message and adds
unit-level test coverage to prevent regressions.
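As a hypothetical illustration of the clearer failure message described above (the PR's exact wording may differ):

```javascript
// Hypothetical helper showing the shape of the improved error message,
// which names the configured limit instead of failing generically.
function limitExceededMessage(maxBytes) {
	return `Decompressed CSV output exceeds the configured limit of ${maxBytes} bytes`;
}
```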
Type of change
Checklist
OED Contributing License Agreement and each author is listed in the Description section.
Limitations
This change focuses on error clarity and regression coverage only. The configured
decompression limit value is unchanged, and no additional rate‑limiting or
throttling behavior has been introduced.