| 1 | +====== |
| 2 | +GridFS |
| 3 | +====== |
| 4 | + |
| 5 | +.. default-domain:: mongodb |
| 6 | + |
| 7 | +.. contents:: On this page |
| 8 | + :local: |
| 9 | + :backlinks: none |
| 10 | + :depth: 2 |
| 11 | + :class: singlecol |
| 12 | + |
| 13 | +Overview |
| 14 | +-------- |
| 15 | + |
| 16 | +In this guide, you can learn how to store and retrieve large files in |
| 17 | +MongoDB using **GridFS**. GridFS is a specification that describes how |
| 18 | +to split files into chunks during storage |
| 19 | +and reassemble them during retrieval. The driver implementation of |
| 20 | +GridFS manages the operations and organization of |
| 21 | +the file storage. |
| 22 | + |
| 23 | +You should use GridFS if the size of your file exceeds the BSON-document |
| 24 | +size limit of 16 megabytes. For more detailed information on whether GridFS is |
| 25 | +suitable for your use case, see the :manual:`GridFS server manual page </core/gridfs>`. |
| 26 | + |
| 27 | +Navigate the following sections to learn more about GridFS operations |
| 28 | +and implementation: |
| 29 | + |
| 30 | +- :ref:`Create a GridFS Bucket <gridfs-create-bucket>` |
| 31 | +- :ref:`Upload Files <gridfs-upload-files>` |
| 32 | +- :ref:`Retrieve File Information <gridfs-retrieve-file-info>` |
| 33 | +- :ref:`Download Files <gridfs-download-files>` |
| 34 | +- :ref:`Rename Files <gridfs-rename-files>` |
| 35 | +- :ref:`Delete Files <gridfs-delete-files>` |
| 36 | +- :ref:`Delete a GridFS Bucket <gridfs-delete-bucket>` |
| 37 | + |
| 38 | +How GridFS Works |
| 39 | +---------------- |
| 40 | + |
| 41 | +GridFS organizes files in a **bucket**, a group of MongoDB collections |
| 42 | +that contain the chunks of files and descriptive information. |
| 43 | +Buckets contain the following collections, named using the convention |
| 44 | +defined in the GridFS specification: |
| 45 | + |
| 46 | +- The ``chunks`` collection stores the binary file chunks. |
| 47 | +- The ``files`` collection stores the file metadata. |
| 48 | + |
| 49 | +When you create a new GridFS bucket, the driver creates the ``chunks`` |
| 50 | +and ``files`` collections, prefixed with the default bucket name ``fs``, unless |
| 51 | +you specify a different name. The driver also creates an index on each |
| 52 | +collection to ensure efficient retrieval of files and related |
| 53 | +metadata. The driver creates the GridFS bucket, if it does not already exist,
| 54 | +only when the first write operation occurs. The driver creates the indexes only if
| 55 | +they do not already exist and the bucket is empty. For more information on
| 56 | +GridFS indexes, see the server manual page on :manual:`GridFS Indexes </core/gridfs/#gridfs-indexes>`. |
| 57 | + |
| 58 | +When storing files with GridFS, the driver splits the files into smaller |
| 59 | +pieces, each represented by a separate document in the ``chunks`` collection. |
| 60 | +It also creates a document in the ``files`` collection that contains |
| 61 | +a unique file id, file name, and other file metadata. You can upload the file from |
| 62 | +memory or from a stream. The following diagram describes how GridFS splits |
| 63 | +files when uploading to a bucket: |
| 64 | + |
| 65 | +.. figure:: /includes/figures/GridFS-upload.png |
| 66 | + :alt: A diagram that shows how GridFS uploads a file to a bucket |
| 67 | + |
| 68 | +When retrieving files, GridFS fetches the metadata from the ``files`` |
| 69 | +collection in the specified bucket and uses the information to reconstruct |
| 70 | +the file from documents in the ``chunks`` collection. You can read the file |
| 71 | +into memory or output it to a stream. |
| 72 | + |
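The splitting and reassembly described above can be sketched in plain Node.js without a database. This is an illustration only, not the driver's implementation: the helper names are hypothetical, the field names (``files_id``, ``n``, ``data``, ``length``, ``chunkSize``) follow the GridFS specification, and the chunk size is the 255 kilobyte default mentioned later in this guide.

```javascript
// Illustrative sketch only; the driver performs this work for you.
// Helper names are hypothetical. Field names follow the GridFS specification.
const DEFAULT_CHUNK_SIZE = 255 * 1024; // 261,120 bytes

// Build one "chunks" document per slice of the file.
function splitIntoChunks(fileId, data, chunkSizeBytes = DEFAULT_CHUNK_SIZE) {
  const chunks = [];
  for (let offset = 0, n = 0; offset < data.length; offset += chunkSizeBytes, n++) {
    chunks.push({
      files_id: fileId, // links the chunk to its "files" document
      n,                // zero-based position of the chunk in the file
      data: data.subarray(offset, offset + chunkSizeBytes),
    });
  }
  return chunks;
}

// Build the matching "files" document that holds the metadata.
function describeFile(fileId, filename, data, chunkSizeBytes = DEFAULT_CHUNK_SIZE) {
  return {
    _id: fileId,
    filename,
    length: data.length,
    chunkSize: chunkSizeBytes,
    uploadDate: new Date(),
  };
}

// Rebuild the file by ordering the chunks by "n" and concatenating them.
function reassemble(chunks) {
  const ordered = [...chunks].sort((a, b) => a.n - b.n);
  return Buffer.concat(ordered.map((chunk) => chunk.data));
}

const file = Buffer.alloc(600000, 1);
const chunks = splitIntoChunks('demo-id', file);
console.log(chunks.length);         // 3
console.log(chunks[2].data.length); // 77760 (the final, partial chunk)
```

Only the last chunk can be smaller than the chunk size, which is why the ``files`` document records both ``length`` and ``chunkSize``.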
| 73 | +.. _gridfs-create-bucket: |
| 74 | + |
| 75 | +Create a GridFS Bucket |
| 76 | +---------------------- |
| 77 | + |
| 78 | +Create a bucket or get a reference to an existing one to begin storing |
| 79 | +or retrieving files from GridFS. Create a ``GridFSBucket`` |
| 80 | +instance, passing a database as the parameter. You can then use the |
| 81 | +``GridFSBucket`` instance to call read and write operations on the files |
| 82 | +in your bucket: |
| 83 | + |
| 84 | +.. code-block:: javascript |
| 85 | + |
| 86 | + const db = client.db(dbName); |
| 87 | + const bucket = new mongodb.GridFSBucket(db); |
| 88 | + |
| 89 | +Pass an options object that sets the ``bucketName`` field as the second
| 90 | +parameter to the ``GridFSBucket`` constructor to create or reference a bucket
| 91 | +with a custom name other than the default name ``fs``, as shown in the following example:
| 92 | + |
| 93 | +.. code-block:: javascript |
| 94 | + |
| 95 | + const bucket = new mongodb.GridFSBucket(db, { bucketName: 'myCustomBucket' }); |
| 96 | + |
| 97 | +For more information, see the :node-api-4.0:`GridFSBucket API documentation <classes/gridfsbucket.html>`. |
| 98 | + |
| 99 | +.. _gridfs-upload-files: |
| 100 | + |
| 101 | +Upload Files |
| 102 | +------------ |
| 103 | + |
| 104 | +Use the ``openUploadStream()`` method from ``GridFSBucket`` to create an upload |
| 105 | +stream for a given file name. You can use the ``pipe()`` method to |
| 106 | +connect a Node.js ``fs`` read stream to the upload stream. The |
| 107 | +``openUploadStream()`` method allows you to specify configuration information |
| 108 | +such as file chunk size and other field/value pairs to store as metadata. Set |
| 109 | +these options as parameters of ``openUploadStream()`` as shown in the |
| 110 | +following code snippet: |
| 111 | + |
| 112 | +.. code-block:: javascript |
| 113 | + |
| 114 | + fs.createReadStream('./myFile'). |
| 115 | + pipe(bucket.openUploadStream('myFile', { |
| 116 | + chunkSizeBytes: 1048576, |
| 117 | + metadata: { field: 'myField', value: 'myValue' } |
| 118 | + }));
| 119 | + |
| 120 | +See the :node-api-4.0:`openUploadStream() API documentation <classes/gridfsbucket.html#openuploadstream>` for more information. |
| 121 | + |
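The snippet above starts the upload but does not report when it completes or fails. A minimal sketch of one way to wait for the result follows. The helper name ``uploadFromStream`` is hypothetical, and the sketch assumes that ``bucket`` is a ``GridFSBucket`` instance whose upload stream emits the standard ``finish`` and ``error`` stream events and exposes the new file's ``_id`` as ``id``.

```javascript
// Hypothetical helper: pipes any readable stream into a GridFS upload stream
// and resolves with the id of the stored file once the upload finishes.
function uploadFromStream(bucket, source, filename, options = {}) {
  return new Promise((resolve, reject) => {
    const uploadStream = bucket.openUploadStream(filename, options);
    source.on('error', reject);       // failure while reading the source
    uploadStream.on('error', reject); // failure while writing to GridFS
    uploadStream.on('finish', () => resolve(uploadStream.id));
    source.pipe(uploadStream);
  });
}

// Usage, assuming `bucket` is a GridFSBucket and `fs` is the Node.js fs module:
//   const id = await uploadFromStream(
//     bucket, fs.createReadStream('./myFile'), 'myFile',
//     { chunkSizeBytes: 1048576 });
```

Returning the ``id`` is convenient because later operations such as renaming and deleting identify a file by its ``_id`` rather than its name.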
| 122 | +.. _gridfs-retrieve-file-info: |
| 123 | + |
| 124 | +Retrieve File Information |
| 125 | +------------------------- |
| 126 | + |
| 127 | +In this section, you can learn how to retrieve file metadata stored in the |
| 128 | +``files`` collection of the GridFS bucket. The metadata contains information |
| 129 | +about the file it refers to, including: |
| 130 | + |
| 131 | +- The ``_id`` of the file |
| 132 | +- The name of the file |
| 133 | +- The length/size of the file |
| 134 | +- The upload date and time |
| 135 | +- A ``metadata`` document in which you can store any other information |
| 136 | + |
| 137 | +Call the ``find()`` method on the ``GridFSBucket`` instance to retrieve
| 138 | +file metadata documents from a GridFS bucket. The method returns a ``FindCursor``
| 139 | +instance from which you can access the results.
| 140 | + |
| 141 | +The following code example shows how to retrieve and print the metadata
| 142 | +of all files in a GridFS bucket. You can traverse the results from the
| 143 | +``FindCursor`` in several ways; this example uses the ``forEach()``
| 144 | +method to display the results:
| 145 | + |
| 146 | +.. code-block:: javascript |
| 147 | + |
| 148 | + const cursor = bucket.find({}); |
| 149 | + cursor.forEach(doc => console.log(doc)); |
| 150 | + |
| 151 | +The ``find()`` method accepts various query specifications and can be |
| 152 | +combined with other methods such as ``sort()``, ``limit()``, and ``project()``. |
| 153 | + |
| 154 | +For more information on the classes and methods mentioned in this section, |
| 155 | +see the following resources: |
| 156 | + |
| 157 | +- :node-api-4.0:`find() API documentation <classes/gridfsbucket.html#find>` |
| 158 | +- :node-api-4.0:`FindCursor API documentation <classes/findcursor.html>` |
| 159 | +- :doc:`Cursor Fundamentals page </fundamentals/crud/read-operations/cursor>` |
| 160 | +- :doc:`Read Operations page </fundamentals/crud/read-operations/>` |
| 161 | + |
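As a sketch of how ``find()`` can be combined with the cursor methods named in this section, the following hypothetical helper lists the most recently uploaded files. The helper name, the default limit, and the choice of projected fields are assumptions made for illustration.

```javascript
// Hypothetical helper: returns metadata for the most recently uploaded files
// that match `filter`, newest first.
async function listRecentFiles(bucket, filter = {}, maxResults = 10) {
  return bucket
    .find(filter)
    .sort({ uploadDate: -1 }) // newest upload first
    .limit(maxResults)
    .project({ filename: 1, length: 1, uploadDate: 1 })
    .toArray();
}

// Usage, assuming `bucket` is a GridFSBucket:
//   const recent = await listRecentFiles(bucket, { filename: 'myFile' }, 5);
```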
| 162 | +.. _gridfs-download-files: |
| 163 | + |
| 164 | +Download Files |
| 165 | +-------------- |
| 166 | + |
| 167 | +You can download files from your MongoDB database by using the |
| 168 | +``openDownloadStreamByName()`` method from ``GridFSBucket`` to create a |
| 169 | +download stream. |
| 170 | + |
| 171 | +The following example shows you how to download a file referenced |
| 172 | +by the file name, stored in the ``filename`` field, into your working |
| 173 | +directory: |
| 174 | + |
| 175 | +.. code-block:: javascript |
| 176 | + |
| 177 | + bucket.openDownloadStreamByName('myFile'). |
| 178 | + pipe(fs.createWriteStream('./outputFile')); |
| 179 | + |
| 180 | +.. note:: |
| 181 | + |
| 182 | + If there are multiple documents with the same ``filename`` value, |
| 183 | + GridFS will stream the most recent file with the given name (as |
| 184 | + determined by the ``uploadDate`` field). |
| 185 | + |
| 186 | +Alternatively, you can use the ``openDownloadStream()`` |
| 187 | +method, which takes the ``_id`` field of a file as a parameter: |
| 188 | + |
| 189 | +.. code-block:: javascript |
| 190 | + |
| 191 | + bucket.openDownloadStream(new ObjectId("60edece5e06275bf0463aaf3")).
| 192 | + pipe(fs.createWriteStream('./outputFile')); |
| 193 | + |
| 194 | +.. note:: |
| 195 | + |
| 196 | + The GridFS streaming API cannot load partial chunks. When a download |
| 197 | + stream needs to pull a chunk from MongoDB, it pulls the entire chunk |
| 198 | + into memory. The 255 kilobyte default chunk size is usually |
| 199 | + sufficient, but you can lower the chunk size to reduce memory
| 200 | + overhead. |
| 201 | + |
| 202 | +For more information on the ``openDownloadStreamByName()`` method, see |
| 203 | +its :node-api-4.0:`API documentation <classes/gridfsbucket.html#opendownloadstreambyname>`. |
| 204 | + |
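The examples above write the download to a file. To read a file into memory instead, one possible approach is to collect the chunks that the download stream emits. The helper name ``downloadToBuffer`` is hypothetical, and the sketch assumes the download stream behaves like a standard Node.js readable stream, which supports async iteration.

```javascript
// Hypothetical helper: reads a GridFS file fully into memory.
// Suitable only for files that fit comfortably in memory.
async function downloadToBuffer(bucket, filename) {
  const parts = [];
  // Readable streams are async iterable; a stream error rejects the loop.
  for await (const chunk of bucket.openDownloadStreamByName(filename)) {
    parts.push(chunk);
  }
  return Buffer.concat(parts);
}

// Usage, assuming `bucket` is a GridFSBucket:
//   const contents = await downloadToBuffer(bucket, 'myFile');
```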
| 205 | +.. _gridfs-rename-files: |
| 206 | + |
| 207 | +Rename Files |
| 208 | +------------ |
| 209 | + |
| 210 | +Use the ``rename()`` method to update the name of a GridFS file in your |
| 211 | +bucket. You must specify the file to rename by its ``_id`` field |
| 212 | +rather than its file name. |
| 213 | + |
| 214 | +.. note:: |
| 215 | + |
| 216 | + The ``rename()`` method only supports updating the name of one file at |
| 217 | + a time. To rename multiple files, retrieve a list of files matching the |
| 218 | + file name from the bucket, extract the ``_id`` field from the files you |
| 219 | + want to rename, and pass each value in separate calls to the ``rename()`` |
| 220 | + method. |
| 221 | + |
| 222 | +The following example shows how to update the ``filename`` field to |
| 223 | +"newFileName" by referencing a document's ``_id`` field: |
| 224 | + |
| 225 | +.. code-block:: javascript |
| 226 | + |
| 227 | + bucket.rename(new ObjectId("60edece5e06275bf0463aaf3"), "newFileName");
| 228 | + |
| 229 | +For more information on this method, see the :node-api-4.0:`rename() API |
| 230 | +documentation <classes/gridfsbucket.html#rename>`. |
| 231 | + |
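The multi-file procedure described in the note above can be sketched as a small helper. The name ``renameAllByName`` is hypothetical, and the sketch assumes that ``rename()`` returns a promise when you call it without a callback.

```javascript
// Hypothetical helper: renames every file whose filename equals `oldName`.
// Resolves with the number of files that were renamed.
async function renameAllByName(bucket, oldName, newName) {
  // Retrieve the metadata documents of all files with the old name.
  const files = await bucket.find({ filename: oldName }).toArray();
  for (const file of files) {
    // rename() handles one file at a time, identified by its _id.
    await bucket.rename(file._id, newName);
  }
  return files.length;
}

// Usage, assuming `bucket` is a GridFSBucket:
//   const count = await renameAllByName(bucket, 'myFile', 'newFileName');
```

The same pattern applies to deleting several files: collect the ``_id`` values with ``find()`` and pass each one to a separate ``delete()`` call.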
| 232 | +.. _gridfs-delete-files: |
| 233 | + |
| 234 | +Delete Files |
| 235 | +------------ |
| 236 | + |
| 237 | +Use the ``delete()`` method to remove a file from your bucket. You must |
| 238 | +specify the file by its ``_id`` field rather than its file name. |
| 239 | + |
| 240 | +.. note:: |
| 241 | + |
| 242 | + The ``delete()`` method only supports deleting one file at a time. To |
| 243 | + delete multiple files, retrieve the files from the bucket, extract |
| 244 | + the ``_id`` field from the files you want to delete, and pass each value |
| 245 | + in separate calls to the ``delete()`` method. |
| 246 | + |
| 247 | +The following example shows you how to delete a file by referencing its ``_id`` field: |
| 248 | + |
| 249 | +.. code-block:: javascript |
| 250 | + |
| 251 | + bucket.delete(new ObjectId("60edece5e06275bf0463aaf3"));
| 252 | + |
| 253 | +For more information on this method, see the :node-api-4.0:`delete() API |
| 254 | +documentation <classes/gridfsbucket.html#delete>`. |
| 255 | + |
| 256 | +.. _gridfs-delete-bucket: |
| 257 | + |
| 258 | +Delete a GridFS Bucket |
| 259 | +---------------------- |
| 260 | + |
| 261 | +Use the ``drop()`` method to remove a bucket's ``files`` and ``chunks`` |
| 262 | +collections, which effectively deletes the bucket. The following |
| 263 | +code example shows you how to delete a GridFS bucket: |
| 264 | + |
| 265 | +.. code-block:: javascript |
| 266 | + |
| 267 | + bucket.drop(); |
| 268 | + |
| 269 | +For more information on this method, see the :node-api-4.0:`drop() API |
| 270 | +documentation <classes/gridfsbucket.html#drop>`.
| 271 | + |
| 272 | +Additional Resources |
| 273 | +-------------------- |
| 274 | + |
| 275 | +- `MongoDB GridFS specification <https://github.com/mongodb/specifications/blob/master/source/gridfs/gridfs-spec.rst>`__ |
| 276 | +- `Runnable example <https://mongodb.github.io/node-mongodb-native/3.6/tutorials/gridfs/streaming/>`__ |
| 277 | + from the Node driver version 3.6 documentation |
| 278 | + |