Download Files and Zip Them in Your Browser Using JavaScript

Instead of generating a zip file and transferring it from your server, why not download the data and zip it in the browser?
Huy Ngo
Aug 18, 2020

I recently worked on a side project, which generates reports per user’s request. For each request, our backend will generate a report, upload it to Amazon S3 storage, and return its URL to the client. Since generating a report takes a while, the output files are stored and the server caches their URLs by request params. If a user orders the same thing, the backend will return the URL of the existing file.

A few days ago, I had a new requirement. Instead of individual files, I needed to download a zip file containing hundreds of reports. The first solution that came to mind was:

  • Prepare the zip file on the server
  • Upload it to a storage
  • Give the client the URL to download it

But this solution has a few drawbacks:

  • The logic to generate a zip file is quite complicated. I would need to either generate all files for every request, or combine reused files with newly generated ones. Both approaches take time to process and require a lot of effort in coding, testing, and later maintenance.
  • It cannot use the caching feature I had already built. Although each zip file contains a different set of reports, most of the individual reports have very likely been generated by an earlier request. So while the zip file itself is unlikely to be reusable, the single files are. With the approach above, I would have to redo the whole thing every time, which is not efficient.
  • Generating a zip file takes a long time. Since my backend is a single-threaded process, this operation can block other requests for a while, and those requests may time out in the meantime.
  • It's very hard to track progress on the client side. I would love to put a progress bar on the website. If everything is handled at the backend, I need an additional mechanism to report status to the frontend, and this is not easy.
  • I want to save infrastructure costs. It would be great to shift some computing to the frontend. My clients won't mind waiting a few more seconds or spending a few extra MB of RAM on their laptops.

The final solution I came up with is to download all the files to the browser and zip them there. In this post, I will walk through how I did it.

Disclaimer: In this post, I'll assume you already have basic knowledge of JavaScript and Promises. If you don't, I recommend getting to know them first and then coming back here :)

Download a single file

Before this change, my system already allowed downloading a single report file. There are many ways to do that. The backend can return the raw file content directly in the HTTP response, or upload the file to separate storage and return its URL. I chose the second approach since I want to cache all the generated files.

Once I have a file URL in hand, the work on the client is pretty simple: open this URL in a new tab. The browser will do the rest to download the file.

const downloadViaBrowser = url => {
  window.open(url, '_blank');
}

Download multiple files and store in memory

When it comes to downloading and zipping multiple files, we can no longer use the simple method above.

  • If a script tries to open many links at once, browsers will treat it as a possible threat and ask the user whether to block these actions. While the user can confirm and continue, it's not a good experience.
  • You cannot control the downloaded file. The browser manages the file content and location.

Another way to handle this is to use fetch to download the files and keep the data in memory as Blobs. We can then write a Blob to a file or combine several Blobs into a zip file.

const download = url => {
  return fetch(url).then(resp => resp.blob());
};

This function returns a promise that resolves to a Blob. We can combine it with Promise.all() to download multiple files. All the downloads start at once; Promise.all() resolves when every child promise has resolved, and rejects as soon as one of them fails.

const downloadMany = urls => {
  return Promise.all(urls.map(url => download(url)));
}
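Two caveats are worth keeping in mind here. fetch only rejects on network failures, so an HTTP error such as a 404 still resolves and its error body would end up in the zip. And with Promise.all(), a single failure discards every other result. The sketch below is one possible way to handle both; downloadChecked, downloadManySettled, and the injectable fetchFn and downloadFn parameters are hypothetical names introduced for illustration, not part of the original code.

```javascript
// Hypothetical helper: like `download`, but rejects on HTTP error statuses.
// `fetchFn` is injectable only so the sketch can be tried without a network.
const downloadChecked = (url, fetchFn = fetch) =>
  fetchFn(url).then(resp => {
    if (!resp.ok) {
      throw new Error(`Failed to download ${url}: HTTP ${resp.status}`);
    }
    return resp.blob();
  });

// Hypothetical helper: waits for every download to settle and
// reports the successful blobs and the errors separately.
const downloadManySettled = (urls, downloadFn = downloadChecked) =>
  Promise.allSettled(urls.map(url => downloadFn(url))).then(results => ({
    blobs: results.filter(r => r.status === 'fulfilled').map(r => r.value),
    errors: results.filter(r => r.status === 'rejected').map(r => r.reason),
  }));
```

Whether a partial zip is acceptable depends on the use case; if every report is required, the stricter Promise.all() behavior may be the better choice.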

Download by groups of X files

But what happens if we need to download a huge number of files at once, say 1000? Using Promise.all() may not be a good idea anymore, because the code will send a thousand requests at once. There are several problems with that approach:

  • The number of concurrent connections supported by the OS and browser is limited, so the browser can only process a few requests at a time. The other requests wait in a queue, and that waiting time can count toward their timeout. As a consequence, many of your requests may time out before they are even sent.
  • Sending a huge number of requests at a time can also overload your backend.

The solution I came up with is to divide the files into multiple groups. Say I have 1000 files to download. Instead of starting them all at once with Promise.all(), I download 5 files at a time. After finishing those 5, I start another pack. In total, that is 200 packs.
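The pack-by-pack idea can be sketched in plain JavaScript roughly as follows. This is a minimal illustration, not the code used in the project; chunk and downloadInPacks are hypothetical names, and downloadFn stands in for the download function defined earlier.

```javascript
// Hypothetical helper: split an array into consecutive groups of at most `size` items.
const chunk = (items, size) => {
  const groups = [];
  for (let i = 0; i < items.length; i += size) {
    groups.push(items.slice(i, i + size));
  }
  return groups;
};

// Download one pack at a time; the files inside a pack are fetched in parallel.
// `downloadFn` stands in for the `download` function defined earlier.
const downloadInPacks = async (urls, downloadFn, filesPerPack = 5) => {
  const blobs = [];
  for (const pack of chunk(urls, filesPerPack)) {
    // Wait for the whole pack before starting the next one.
    const packBlobs = await Promise.all(pack.map(url => downloadFn(url)));
    blobs.push(...packBlobs);
  }
  return blobs;
};
```

One drawback of strict packs is that each pack is only as fast as its slowest file, so some connection slots sit idle while the pack finishes.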

To implement this, we can write custom logic. A simpler way is to use the third-party library bluebird, which implements many helpful promise functions. For this use case, I will use Promise.map(). Notice that Promise here is the custom Promise provided by the library, not the built-in Promise.

import Promise from 'bluebird';

const downloadByGroup = (urls, files_per_group=5) => {
  return Promise.map(
    urls, 
    async url => {
      return await download(url);
    },
    {concurrency: files_per_group}
  );
}

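With the implementation above, the function receives an array of URLs and downloads them all, with at most files_per_group downloads in flight at any time. Note that bluebird does not wait for a whole pack to finish: as soon as one download completes, the next one starts. The function returns a Promise that resolves with the blobs, in the same order as the URLs, once every download has finished, and rejects if any of them fails.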
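For readers who prefer not to add a dependency, the same concurrency limit can be approximated with a small worker pool built on the native Promise. The sketch below is an assumption-laden illustration rather than a drop-in replacement for bluebird: mapWithConcurrency and its onProgress callback are hypothetical names, and the callback is included only because a progress bar was one of the motivations above.

```javascript
// Hypothetical helper: run `mapper` over `items` with at most
// `concurrency` calls in flight, keeping results in input order.
const mapWithConcurrency = async (
  items,
  mapper,
  concurrency = 5,
  onProgress = () => {}
) => {
  const results = new Array(items.length);
  let nextIndex = 0;
  let done = 0;

  // Each worker keeps claiming the next unprocessed index until none are left.
  const worker = async () => {
    while (nextIndex < items.length) {
      const index = nextIndex++;
      results[index] = await mapper(items[index], index);
      done += 1;
      onProgress(done, items.length);
    }
  };

  const workerCount = Math.min(concurrency, items.length);
  // Rejects as soon as any worker hits a failing item.
  await Promise.all(Array.from({ length: workerCount }, () => worker()));
  return results;
};
```

A call such as mapWithConcurrency(urls, download, 5, (done, total) => updateBar(done / total)) would then drive a progress bar, where updateBar is a placeholder for whatever UI code the page uses. Note that when one item fails, downloads already in flight are not cancelled.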

Create zip file

Now I have everything downloaded into memory. As I mentioned above, the downloaded content is stored as Blobs. The next step is to create a zip file from those Blobs and save it.

import JsZip from 'jszip';
import FileSaver from 'file-saver';

const exportZip = blobs => {
  const zip = new JsZip();
  blobs.forEach((blob, i) => {
    // Each blob becomes one entry in the archive
    zip.file(`file-${i}.csv`, blob);
  });
  // Return the promise so callers can wait until the zip is generated and saved
  return zip.generateAsync({type: 'blob'}).then(zipFile => {
    const currentDate = new Date().getTime();
    const fileName = `combined-${currentDate}.zip`;
    return FileSaver.saveAs(zipFile, fileName);
  });
}
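The entries above are named by index, which loses the original report names. If the URLs end in meaningful file names, one option is to derive the entry names from them. The helper below is a hypothetical sketch: entryNames is an invented name, it assumes absolute URLs (as S3 URLs are), and its duplicate handling is deliberately simplistic.

```javascript
// Hypothetical helper: derive zip entry names from the last path segment of each URL.
// Falls back to an index-based name, and prefixes the index when a name repeats.
const entryNames = urls => {
  const seen = new Set();
  return urls.map((url, i) => {
    // Query strings (e.g. signed-URL parameters) are not part of `pathname`.
    const lastSegment = new URL(url).pathname.split('/').pop();
    let name = lastSegment ? decodeURIComponent(lastSegment) : `file-${i}.csv`;
    if (seen.has(name)) {
      name = `${i}-${name}`;
    }
    seen.add(name);
    return name;
  });
};
```

Because the blobs come back in the same order as the URLs, the names could then be passed alongside them, for example zip.file(names[i], blob) inside the forEach.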

Final code

Here is all the code from this post in one place.

import Promise from 'bluebird';
import JsZip from 'jszip';
import FileSaver from 'file-saver';

const download = url => {
  return fetch(url).then(resp => resp.blob());
};

const downloadByGroup = (urls, files_per_group=5) => {
  return Promise.map(
    urls, 
    async url => {
      return await download(url);
    },
    {concurrency: files_per_group}
  );
}

const exportZip = blobs => {
  const zip = new JsZip();
  blobs.forEach((blob, i) => {
    // Each blob becomes one entry in the archive
    zip.file(`file-${i}.csv`, blob);
  });
  // Return the promise so callers can wait until the zip is generated and saved
  return zip.generateAsync({type: 'blob'}).then(zipFile => {
    const currentDate = new Date().getTime();
    const fileName = `combined-${currentDate}.zip`;
    return FileSaver.saveAs(zipFile, fileName);
  });
}

const downloadAndZip = urls => {
  return downloadByGroup(urls, 5).then(exportZip);
}

Conclusion

  • Using the client's computing power can be a very effective way to reduce workload and complexity on the backend.
  • Don't send a huge number of requests at a time. You can run into trouble on both the frontend and the backend. Instead, divide the work into small chunks.
  • This post introduced three third-party libraries: bluebird, jszip, and file-saver. They worked well for me and could be helpful for you too :)