Building, collecting, and serving assets¶
MDN requires static assets (images, JavaScript, CSS, and other files) to augment the HTML generated by Django. Kuma uses a combination of technologies and techniques to build and deliver static assets. Many of the build and development processes are documented on the Installation and Development documents. This document goes into the details.
The three phases of an asset’s life are Building, Collecting, and Serving.
- Building - Source files are processed to build intermediate and final assets
  - Locale files contain translatable strings extracted from source files, are translated by humans, and are compiled to a binary representation for runtime use.
  - Localization JavaScript files contain translated strings in an executable format to allow in-browser translation of UI strings.
  - CKEditor is packaged with custom plugins for MDN’s use cases.
  - Many JavaScript libraries that are used on MDN are included in Kuma’s repository for version control.
  - JavaScript bundles, assembled by django-pipeline, combine several source files into a single minified JavaScript file.
  - CSS bundles, assembled by django-pipeline or process:sass, combine several source Sass files into a single minified CSS file.
- Collecting - Built assets are collected to the /static folder.
  - Django’s staticfiles provides a framework of Finder and Storage classes for collecting files.
  - django-pipeline augments staticfiles, to allow compiling, bundling, and minifying JS and CSS assets in the collection phase.
- Serving - Collected assets are served to visitors in development and production
  - In Django and Jinja templates, the static tag returns the URL of assets collected by staticfiles.
  - The statici18n tag returns the URL of localization JavaScript collected by staticfiles.
  - The pipeline tags javascript and stylesheet return the HTML markup for assets processed and collected by django-pipeline.
  - WhiteNoise serves static assets as part of the Django process.
Extracting and building locale files¶
Kuma uses Pontoon to translate strings in the user interface, in error messages, and in emails. These are stored in the mdn-l10n repository, and included as a git submodule at locale/. See the localization document for more details about locales.
Puente extracts strings to the Portable Object Template (.pot) files, as specified in the PUENTE configuration. The file locale/templates/LC_MESSAGES/django.pot contains strings from template files and Python code. The file javascript.pot contains strings from JavaScript files. Puente looks for the string parameters of gettext functions, such as gettext(), the common alias _(), and ngettext(). It also parses longer strings in the template tag trans.
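As a rough illustration of what an extractor looks for, the sketch below marks strings in Python code. It uses the standard library gettext module as a stand-in for Django's translation functions, and the function names and messages are hypothetical, not taken from Kuma.

```python
import gettext

# With no catalog installed, these stdlib functions return the source string
# unchanged, which is enough to show how strings are marked for extraction.
_ = gettext.gettext
ngettext = gettext.ngettext


def greeting(name):
    # An extractor picks up the literal first argument of _().
    return _("Welcome back, %(name)s!") % {"name": name}


def revision_count(count):
    # ngettext() carries a singular and a plural form; both are extracted.
    template = ngettext("%(count)d revision", "%(count)d revisions", count)
    return template % {"count": count}


print(greeting("Ada"))
print(revision_count(1))
print(revision_count(3))
```

Keeping the message a plain string literal, with named placeholders filled in afterwards, is what lets an extractor find it without running the code.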
Next, the changes are merged into the existing Portable Object (.po) files, such as locale/fr/LC_MESSAGES/django.po, to add new strings and comment out removed strings.
Extracting and merging is done with make localeextract, usually during deployment, when UI strings change. This uses the extract management command provided by Puente, which uses Babel to extract strings and update the catalog. A maintainer pushes the updated catalogs as a new commit to the mdn-l10n repository.
Pontoon detects that the repository has changed, and notifies localization teams that there are new strings. In about 48 hours, the most active teams will translate strings into the top 10 MDN languages. These are applied by updating the locale submodule during the deployment process.
At run time, Machine Object (.mo) files, such as locale/fr/LC_MESSAGES/django.mo, are used by gettext functions, like gettext() and _(), to display the localized strings. These are built with make localecompile when creating the production images or when a developer wants to see updated translations.
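The run-time lookup can be sketched with the standard library, which reads the same .mo layout. This is only an approximation: Django manages its own catalogs, and load_catalog, its defaults, and the example directory are hypothetical.

```python
import gettext


def load_catalog(language, localedir="locale", domain="django"):
    # Looks for <localedir>/<language>/LC_MESSAGES/<domain>.mo.
    # fallback=True returns a pass-through catalog when the file is missing.
    return gettext.translation(
        domain, localedir=localedir, languages=[language], fallback=True
    )


# The directory below does not exist, so the pass-through catalog is returned
# and the source string comes back unchanged.
catalog = load_catalog("fr", localedir="/nonexistent-locale-dir")
print(catalog.gettext("Sign in"))
```

With a compiled locale/fr/LC_MESSAGES/django.mo in place, the same call would return the translated string instead.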
Building localization JavaScript¶
Django includes a JavaScriptCatalog view that provides JavaScript implementations of gettext functions, as well as translations for each locale. It is inefficient to use this view directly, since it is generated on access. For efficiency, django-statici18n generates files for each locale from the JavaScriptCatalog output, so they can be served as static assets.
The translation catalog files are created with make compilejsi18n from the locale Machine Object .mo files. This calls the compilejsi18n management command provided by django-statici18n. Kuma sets STATICI18N_ROOT to build/locale, and the output files have names like build/locale/jsi18n/de/javascript.js.
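The output layout described above can be expressed as a small path calculation. The helper catalog_build_path and its output_dir and domain parameters are hypothetical names chosen to reproduce the example path; they are not django-statici18n API.

```python
import posixpath

STATICI18N_ROOT = "build/locale"  # value stated in the text above


def catalog_build_path(locale, output_dir="jsi18n", domain="javascript"):
    # Hypothetical helper: <root>/<output_dir>/<locale>/<domain>.js
    return posixpath.join(STATICI18N_ROOT, output_dir, locale, domain + ".js")


print(catalog_build_path("de"))
```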
Building CKEditor¶
CKEditor is a complex JavaScript application that provides a WYSIWYG editor for MDN wiki pages. It is packaged with plugins, some from third parties, and some custom to MDN.
The CKEditor build process is documented on the CKEditor document. The built files are checked into the Kuma repository.
Including JS libraries¶
Third-party JavaScript libraries are included in the Kuma repository, to avoid ambiguity about what versions of libraries are used. Some libraries were added manually, and others with Bower. See Front-end asset dependencies for more details about these libraries.
Some of these libraries are served directly to visitors, while others are included in pipeline JavaScript bundles.
Building pipeline JavaScript bundles¶
Pipeline JavaScript bundles combine several JavaScript files into a single file, with optional minification. For example, the file static/build/js/main.js is the combination of 10 JavaScript files:
- kuma/static/js/libs/jquery/jquery.js (jQuery 2.2.0)
- kuma/static/js/libs/icons.js
- kuma/static/js/components.js
- kuma/static/js/analytics.js
- kuma/static/js/main.js
- kuma/static/js/components/nav-main-search.js
- kuma/static/js/auth.js
- kuma/static/js/highlight.js
- kuma/static/js/wiki-compat-trigger.js
- kuma/static/js/lang-switcher.js
The JS bundles are specified in PIPELINE_JS in the Django settings.
The bundles are served differently in “development” and “production” modes. This is roughly controlled by the Django setting DEBUG, which sets further parameters like PIPELINE['PIPELINE_ENABLED'], and the environment setting DJANGO_SETTINGS_MODULE, which switches the Django settings file. See django-pipeline as well as the pipeline tags section for details.
In development, the source files (10 for main.js) are served, so there are 10 <script> elements in the HTML when {{javascript('main')}} is used in a template. In production, the output bundle is used, so a single <script> tag appears in the HTML. The single bundle is also processed with UglifyJS, which removes whitespace, replaces variable names with shorter names, and performs other transformations to make the file smaller.
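The difference between the two modes can be sketched as a function that returns either one element for the bundle or one per source file. This is a simplified illustration, not django-pipeline's implementation; PACKAGE is a shortened, hypothetical version of the main.js bundle, and script_tags and the /static/ prefix are assumptions.

```python
from html import escape

# Hypothetical package, shaped like the main.js example above but shortened.
PACKAGE = {
    "source_filenames": (
        "js/libs/jquery/jquery.js",
        "js/components.js",
        "js/main.js",
    ),
    "output_filename": "build/js/main.js",
}


def script_tags(package, pipeline_enabled, static_url="/static/"):
    """Return the <script> elements a template would emit for one bundle."""
    if pipeline_enabled:
        # Production: a single element pointing at the combined bundle.
        paths = [package["output_filename"]]
    else:
        # Development: one element per source file, in order.
        paths = list(package["source_filenames"])
    return [
        '<script src="%s"></script>' % escape(static_url + path, quote=True)
        for path in paths
    ]


for tag in script_tags(PACKAGE, pipeline_enabled=False):
    print(tag)
```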
Building pipeline CSS bundles¶
Pipeline CSS bundles are conceptually similar to Pipeline JS bundles. Some contain multiple source files, such as static/build/styles/dashboards.css, which combines two source files.
Source styles are written in Sass, and compiled to CSS with node-sass. These must be compiled to CSS in both development and production modes. Backend developers tend to use make build-static to build and collect these files, and front-end developers tend to use npm run process:sass to directly compile them.
The CSS bundles are specified in PIPELINE_CSS in the Django settings.
The bundles are served differently in “development” and “production” modes. This is roughly controlled by the Django setting DEBUG, which sets further parameters like PIPELINE['PIPELINE_ENABLED'], and the environment setting DJANGO_SETTINGS_MODULE, which switches the Django settings file. See django-pipeline as well as the pipeline tags section for details.
In development, the source files (2 for dashboards.css) are used, so there are 2 <link> elements in the HTML when {{stylesheet('dashboards')}} is used in a template. In production, the output bundle is used, so a single <link> tag appears in the HTML. When bundled, CSS is also processed by clean-css, which transforms the CSS to make the output files smaller.
Collecting asset files with staticfiles¶
Django provides the django.contrib.staticfiles app, widely used in Django projects to standardize where assets are stored, to collect them for development and production, and to use different asset URLs in different environments.
In development mode, the staticfiles app helps identify assets spread across the project, and often allows a rapid development cycle (for example, change a file, refresh the browser, and see the effects of the changed file). For production, the staticfiles app provides the management command collectstatic, which gathers files to the /static folder for efficient file serving.
The Django documents for staticfiles are mostly focused on usage. Additional details are needed to understand how django-pipeline customizes staticfiles.
Configuration¶
The staticfiles app is configured by Django settings:
- STATIC_ROOT - The folder on the file system where assets are collected. For MDN, this is the static folder in the kuma directory.
- STATIC_URL - The base URL for static assets. In development, this is http://localhost:8000/static/, and in production it is https://developer.mozilla.org/static/.
- STATICFILES_FINDERS - The dotted paths to classes implementing staticfiles Finder. These determine what files will be collected and served. Kuma uses four finders:
  - django.contrib.staticfiles.finders.FileSystemFinder: Finds files in folders specified by STATICFILES_DIRS
  - django.contrib.staticfiles.finders.AppDirectoriesFinder: Finds files in the static subfolder of any installed apps
  - pipeline.finders.CachedFileFinder: Strips hashes from filenames to identify the “pre-cached” names for files.
  - pipeline.finders.PipelineFinder: When combined assets are not enabled (PIPELINE['PIPELINE_ENABLED'] == False), returns the source files instead of the combined bundle file.
- STATICFILES_DIRS - A list of folders in the kuma directory that the FileSystemFinder will scan for static assets. For MDN, this includes:
  - assets/static
  - assets/ckeditor4/build (to /static/js/libs/ckeditor4/build)
  - kuma/static
  - kuma/javascript/dist
  - build/locale
  - jinja2/includes/icons
  For example, the localization JavaScript build/locale/jsi18n/fr/javascript.js will be collected to static/jsi18n/fr/javascript.js.
- STATICFILES_STORAGE - The dotted path to a class implementing staticfiles Storage. Storage determines where files are stored, what URLs they have, and provides hooks for modifying files when copying them. Kuma uses three different storages, depending on the context:
  - Development server (DEBUG=True): pipeline.storage.NonPackagingPipelineStorage, which avoids combining files when collecting them.
  - Production server (DEBUG=False): kuma.core.pipeline.storage.ManifestPipelineStorage, which combines packaged files, hashes the names, and creates a manifest.
  - Testing (pytest etc.) and make commands: pipeline.storage.PipelineStorage, which combines packaged files but does not hash the names.
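Put together, the settings above might look roughly like the following fragment. It is a sketch assembled from the values listed in this section rather than a copy of Kuma's settings file; the /app root, the path() helper, and the development STATIC_URL are assumptions.

```python
import posixpath

ROOT = "/app"  # assumption: project root inside the development container


def path(*parts):
    # Hypothetical helper that builds absolute paths under ROOT.
    return posixpath.join(ROOT, *parts)


STATIC_URL = "http://localhost:8000/static/"  # development value from the text
STATIC_ROOT = path("static")

STATICFILES_FINDERS = (
    "django.contrib.staticfiles.finders.FileSystemFinder",
    "django.contrib.staticfiles.finders.AppDirectoriesFinder",
    "pipeline.finders.CachedFileFinder",
    "pipeline.finders.PipelineFinder",
)

STATICFILES_DIRS = [
    path("assets", "static"),
    # A (prefix, folder) pair collects the folder under that prefix.
    ("js/libs/ckeditor4/build", path("assets", "ckeditor4", "build")),
    path("kuma", "static"),
    path("kuma", "javascript", "dist"),
    path("build", "locale"),
    path("jinja2", "includes", "icons"),
]

# Development choice from the list above.
STATICFILES_STORAGE = "pipeline.storage.NonPackagingPipelineStorage"
```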
Finder classes¶
The staticfiles app uses Finders to locate asset files. Django considers this a private API, so it may change in the future. There are two methods the BaseFinder class expects to be implemented:
- find(path): Given a short path like css/wiki.css, return the absolute path to the file. This is used by the findstatic management command, and to find files when serving assets in development mode.
- list(ignore_patterns): Return a list of the files this Finder can find, along with a storage instance for each. The collectstatic management command uses this to gather files.
The staticfiles app provides two finders used by Kuma:
- The FileSystemFinder collects files under the folders specified in the STATICFILES_DIRS setting.
- The AppDirectoriesFinder collects files in the (optional) static subfolder of any installed app listed in INSTALLED_APPS. This is how Django applications, including ones bundled with Django, distribute JavaScript, CSS, images, and other assets. It isn’t used for Kuma’s apps. Instead, we’ve standardized on kuma/static and other named paths.
The Finders are used by WhiteNoise to determine which file to serve in development mode. The management command findstatic can be used to determine which file is served, such as:
$ ./manage.py findstatic -v2 js/main.js
Found 'js/main.js' here:
/app/kuma/static/js/main.js
/app/static/js/main.js
Looking in the following locations:
/app/kuma/static
/app/build/locale
/app/jinja2/includes/icons
/usr/local/lib/python2.7/site-packages/flat/static
/usr/local/lib/python2.7/site-packages/django/contrib/admin/static
/usr/local/lib/python2.7/site-packages/constance/static
/usr/local/lib/python2.7/site-packages/djcelery/static
/usr/local/lib/python2.7/site-packages/django_extensions/static
/usr/local/lib/python2.7/site-packages/rest_framework/static
/usr/local/lib/python2.7/site-packages/debug_toolbar/static
/app/static
When multiple files are found, the first is used. In the above example, /app/kuma/static/js/main.js will be served in development for /static/js/main.js.
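The "first match wins" rule can be sketched without Django as a search over an ordered list of folders. The function find_first and the temporary folders are hypothetical stand-ins for the finders and for kuma/static and static.

```python
import os
import tempfile


def find_first(relative_path, locations):
    """Return the first existing file among ordered search locations, or None."""
    for root in locations:
        candidate = os.path.join(root, *relative_path.split("/"))
        if os.path.isfile(candidate):
            return candidate
    return None


# Two temporary folders stand in for the source and collected locations.
with tempfile.TemporaryDirectory() as tmp:
    source_dir = os.path.join(tmp, "kuma", "static")
    collected_dir = os.path.join(tmp, "static")
    for folder in (source_dir, collected_dir):
        os.makedirs(os.path.join(folder, "js"))
        with open(os.path.join(folder, "js", "main.js"), "w") as handle:
            handle.write("// placeholder\n")

    winner = find_first("js/main.js", [source_dir, collected_dir])
    # The earlier location shadows the later one.
    source_wins = winner.startswith(source_dir)
    print(source_wins)
```

Because the source folder is searched before the collected folder, edits to the source file are what a developer sees after a refresh.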
Storage classes¶
The staticfiles app uses a Storage class, which extends Django’s Storage class for asset workflows. Django documents how to write a custom storage system, and there are many 3rd-party storage packages for using various cloud providers for file hosting. The configured STATICFILES_STORAGE class is used when collecting files with ./manage.py collectstatic.
Django’s standard Storage classes provide methods like delete(), exists(), and size() for implementing file methods, and methods like listdir() for getting lists of files. There is a wide variety of storage backends with different capabilities, and Django allows most methods to raise NotImplementedError if an operation is not supported or is too expensive.
A staticfiles Storage class extends the standard Storage classes and requires a few more methods, although the exact methods are undocumented. Some are path(name), to turn a relative path to a full path, and url(path), to get the external URL of the file. An optional method, post_process(), can be defined to further process the files, and returns a map of the old paths to the new paths.
The default storage, StaticFilesStorage, is based on the standard FileSystemStorage, and copies static files to STATIC_ROOT (the static folder). For the url() method, it prepends the STATIC_URL to the path.
ManifestStaticFilesStorage implements the post_process() method to add the MD5 hash of the file’s contents to the filename. This allows these files to be served with very long cache times, since changes will also change the filename. It also requires manipulating the contents so that references to assets within other files, such as a CSS @import statement, are updated to the hashed names. This often requires that source files use relative paths like ../img/logo.svg, so that the tool can find the destination file.
Because of the intense file processing, ManifestStaticFilesStorage doesn’t support the live updates of development mode. It requires DEBUG=False, and that ./manage.py collectstatic is run before running the server, or before a server restart. A map of original to hashed names is stored in staticfiles.json, and is read at server startup to determine the hashed names.
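The naming scheme can be illustrated with a few lines of standard-library Python. This is a sketch of the idea, not Django's code: hashed_name is a hypothetical helper, the 12-character digest length is an assumption, and the file contents are made up.

```python
import hashlib
import posixpath


def hashed_name(name, content):
    """Insert a short MD5 digest of the content before the file extension."""
    digest = hashlib.md5(content).hexdigest()[:12]  # assumed digest length
    root, ext = posixpath.splitext(name)
    return "%s.%s%s" % (root, digest, ext)


# Hypothetical sources; a real run would read the collected files from disk.
sources = {
    "css/wiki.css": b"body { margin: 0 }",
    "js/main.js": b"console.log('ready');",
}

# A manifest-style mapping of original names to hashed names.
manifest = {"paths": {name: hashed_name(name, data) for name, data in sources.items()}}
print(manifest)
```

Any change to a file's contents yields a different name, which is why the hashed files can be cached for a long time.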
CachedStaticFilesStorage is similar to ManifestStaticFilesStorage, but stores the filename mapping in the cache. It is slower than staticfiles.json, and is used when write access to the filesystem is forbidden.
django-pipeline¶
The django-pipeline library is used for packing assets. It provides CSS and JavaScript concatenation and compression, built-in JavaScript template support, and optional data-URI image and font embedding. It does this by extending and overriding the django-staticfiles app, so that assets are processed with the standard ./manage.py collectstatic command.
Kuma uses django-pipeline to:
- Compile Sass .scss files to plain CSS with node-sass
- Combine multiple JS and CSS files into a single file (“bundle”) in production
- Compress CSS files with cleancss
- Compress JS files with UglifyJS
Configuration¶
The django-pipeline app is configured with the dictionary PIPELINE. There are many configuration items, some of which are:
- PIPELINE_ENABLED: True to concatenate and compress assets (testing and production), and False to skip concatenation and compression.
- PIPELINE_COLLECTOR_ENABLED: True to collect assets (testing and production), and False to skip collection and leave them in the source locations.
- COMPILERS: A list of CSS compilers. pipeline’s SASSCompiler in testing and production, and kuma.core.pipeline.sass.DebugSassCompiler (which does nothing, but instead defers to node-sass) in development.
The Makefile specifies the testing configuration, so commands like make collectstatic run with PIPELINE_ENABLED and PIPELINE_COLLECTOR_ENABLED turned on. However, both are disabled when running the development server.
django-pipeline specifies outputs as a “package”, which specifies one or more inputs, one output, and some optional settings and overrides. PIPELINE['JAVASCRIPT'] specifies the JavaScript packages, and PIPELINE['STYLESHEETS'] specifies the Sass/CSS packages.
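A trimmed-down PIPELINE dictionary might look like the fragment below. It is an illustrative sketch, not Kuma's configuration: the package contents and source file names are hypothetical, and tying both switches to DEBUG is a simplification of the behaviour described above.

```python
DEBUG = True  # assumption: development mode

PIPELINE = {
    # Both switches are off in development and on in testing and production.
    "PIPELINE_ENABLED": not DEBUG,
    "PIPELINE_COLLECTOR_ENABLED": not DEBUG,
    "JAVASCRIPT": {
        "dashboard": {
            "source_filenames": ("js/dashboard.js",),  # hypothetical source
            "output_filename": "build/js/dashboard.js",
        },
    },
    "STYLESHEETS": {
        "dashboards": {
            # Hypothetical Sass sources; the real bundle has two inputs.
            "source_filenames": (
                "styles/dashboards.scss",
                "styles/diff.scss",
            ),
            "output_filename": "build/styles/dashboards.css",
        },
    },
}
```

Each package name, such as dashboards, is the value passed to the javascript or stylesheet template tag.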
Finders¶
Kuma uses two Finders from django-pipeline. CachedFileFinder strips hashes from filenames to identify the “pre-cached” names for files, by removing the middle element of filenames with three dot-separated parts. This may have been useful in django-pipeline 1.3 or earlier, but it appears to do nothing now, or could potentially do the wrong thing, such as resolving bootstrap.min.js as bootstrap.js.
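The renaming behaviour described above, including the bootstrap.min.js pitfall, can be reproduced with a short sketch. The function strip_middle_element is a hypothetical illustration of the described behaviour, not the library's code, and the hashed filename is made up.

```python
def strip_middle_element(path):
    """Drop the second-to-last dot-separated element, if there is one."""
    try:
        start, _middle, extension = path.rsplit(".", 2)
    except ValueError:
        # Fewer than three dot-separated parts: nothing to strip.
        return None
    return ".".join((start, extension))


print(strip_middle_element("js/main.0a1b2c3d4e5f.js"))  # hashed name, intended case
print(strip_middle_element("bootstrap.min.js"))  # not a hash, but stripped anyway
print(strip_middle_element("main.js"))
```

The function cannot tell a content hash from any other middle element, which is the source of the wrong result for minified file names.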
PipelineFinder does nothing if PIPELINE['PIPELINE_ENABLED'] is True (testing and production), and uses the Storage to find files if it is disabled. For Kuma, this means it may find files in the STATIC_ROOT directory. However, since the FileSystemFinder finds most files in kuma/static first, it is doubtful that this Finder ever applies.
Storage¶
Most of the functionality of django-pipeline is implemented as a Storage class, and Kuma uses three different implementations depending on the environment.
The simplest storage, used during testing and in the Makefile, is pipeline.storage.PipelineStorage, which extends the staticfiles Storage class StaticFilesStorage, with a post_process step that packages JS and CSS into one-file bundles, according to the PIPELINE configuration.
Development uses pipeline.storage.NonPackagingPipelineStorage. This works the same way as PipelineStorage, but avoids creating packages, where several files are combined into one. JavaScript files are served from the source folders, but CSS files need to be compiled from Sass, and are served from the /static folder after collection. When developing style files, a developer needs to run ./manage.py collectstatic to see changes.
In production, kuma.core.pipeline.storage.ManifestPipelineStorage is used. This combines the package processing of PipelineStorage with the hashed assets and staticfiles.json of ManifestStaticFilesStorage. These are generated when the production Docker containers are created.
Template tag static¶
Django provides a template tag static that outputs the URL of the static asset for HTML. Without staticfiles installed, it just adds STATIC_URL to the start of the path. With staticfiles, it calls the url(path) method of the Storage class. In production, with ManifestStaticFilesStorage, it uses staticfiles.json to return URLs with hashes in the name.
For example, here is the HTML that includes the Tumbeast in the 404 page:
<div id="beastainer">
<img id="beast404le" src="{{ static('img/beast-404_LE.png') }}" alt="">
<img id="beast404re" src="{{ static('img/beast-404_RE.png') }}" alt="">
<img class="beast 404" src="{{ static('img/beast-404.png') }}" alt="">
</div>
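The lookup behind the static tag can be sketched as a dictionary lookup followed by a prefix. This is a simplified model: static_url is a hypothetical function, the manifest entry and its hash are made up, and falling back to the unhashed path for a missing entry is an assumption of this sketch.

```python
import json

STATIC_URL = "/static/"

# Made-up manifest in the shape of staticfiles.json; the hash is illustrative.
MANIFEST_JSON = """
{
  "paths": {"img/beast-404.png": "img/beast-404.0a1b2c3d4e5f.png"},
  "version": "1.0"
}
"""


def static_url(path, manifest=None):
    """Return the public URL for an asset, using hashed names when available."""
    if manifest is not None:
        # Assumption: fall back to the unhashed path if the entry is missing.
        path = manifest["paths"].get(path, path)
    return STATIC_URL + path


manifest = json.loads(MANIFEST_JSON)
print(static_url("img/beast-404.png"))  # development-style URL
print(static_url("img/beast-404.png", manifest))  # production-style URL
```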
Template tag statici18n¶
The tag statici18n is provided by django-statici18n. It works like the static tag, outputting the URL of the localization JavaScript. This is included in the <body> of all pages via the base template, near the bottom:
<script src="{{ statici18n(request.LANGUAGE_CODE) }}"></script>
Template tags javascript and stylesheet¶
django-pipeline provides two template tags, {% javascript('bundle') %} and {% stylesheet('bundle') %}, that can inject the <script> and <link> elements into a template.
Bundling is controlled by the setting PIPELINE['PIPELINE_ENABLED'] (False for development, True for production). When bundled, the assets are assumed to be processed and collected, so a single element representing the final asset URL is inserted. When bundling is off, the assets are assumed to still be in the source form, and multiple HTML elements are inserted into the document. These tags look more like Jinja2 calls than HTML, like these tags from the revision dashboard:
{% block js %}
{% javascript 'jquery-ui' %}
{% javascript 'dashboard' %}
{% endblock %}
django-pipeline supports other output formats. For example, the editor-content bundle is processed with the javascript-array template, which converts the URLs to a format that can be injected into a JavaScript array, such as the configuration script:
win.mdn.assets = {
css: {
'editor-content': [
{%- stylesheet 'editor-content' %}
{%- stylesheet 'editor-locale-%s' % LANG %}
],
'wiki-compat-tables': [{% stylesheet 'wiki-compat-tables' %}]
},
js: {
'syntax-prism': [{% javascript 'syntax-prism' %}],
'wiki-compat-tables': [{% javascript 'wiki-compat-tables' %}]
}
};
Serving assets with WhiteNoise¶
WhiteNoise is a static file serving application, and is an alternative to serving static assets with nginx, Apache, or from Amazon S3. On Kuma, it is used to serve static assets in development as well as production. It made it easy to serve HTML and related assets on the same HTTP/2 connection.
In development (DEBUG = True) and testing, WhiteNoise is in “autorefresh” mode, and uses the staticfiles finders. Each web request to /static scans for the file to use, which can be slow, but will catch any changes made to the files.
In production (DEBUG = False), the files in STATIC_ROOT (/static) are indexed when the web server starts up. WhiteNoise also determines headers, such as caching headers and the CORS header, that will be sent with the file. This makes it very fast to serve static files, but changes after the web server starts will not be noticed.
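The two modes map onto a pair of WhiteNoise settings, shown in the fragment below. This is a sketch rather than Kuma's settings file: whether Kuma sets these values explicitly or relies on WhiteNoise's DEBUG-based defaults is an assumption, and the middleware list is shortened.

```python
DEBUG = True  # assumption: development mode

MIDDLEWARE = [
    "django.middleware.security.SecurityMiddleware",
    # WhiteNoise serves static assets from inside the Django process.
    "whitenoise.middleware.WhiteNoiseMiddleware",
]

# Development: look files up on each request, through the staticfiles finders.
# Production: both are False, so STATIC_ROOT is indexed once at startup.
WHITENOISE_AUTOREFRESH = DEBUG
WHITENOISE_USE_FINDERS = DEBUG
```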
WhiteNoise provides its own Storage classes, that can compress and cache static asset files. These are currently unused by Kuma, which uses classes based on those provided by django-pipeline.
Future¶
- Ensure files that are not meant for visitors are not collected, to speed up development, collecting, and preparing production images.
- Remove the CachedFileFinder and PipelineFinder.
- Remove django-pipeline, using webpack on the server as well before running ./manage.py collectstatic.
- Add django-webpack-loader or similar to integrate React assets
History¶
The staticfiles application was probably part of the Kuma project from the beginning in 2011. In the SCL3 datacenter, one of the first steps of a production push was collecting the static files to a directory on a network drive. This was shared between web servers, so that the new assets were immediately available as the new code was deployed. Because of file hashing, it was possible to keep old versions of assets along with new versions. These files were served by Apache.
In 2013, staticfiles was used to serve assets in the development Vagrant environment instead of Apache, so that collectstatic was not needed to see changes. However, CSS files were converted to Stylus that year, which required compilation for development and deployment.
In 2015, several changes were made to prepare for the move from SCL3 to AWS. One change was to move assets from the /media folder, which is traditionally used for user uploads, to the /kuma/static folder. Another was adopting django-pipeline to compile assets, and WhiteNoise to serve them in production.
In 2017, MDN hosting moved from SCL3 to AWS. Apache was no longer used to serve assets, and WhiteNoise was used in production as well. This dropped the ability to serve old versions of assets, but a CDN with long caching times mitigated issues around deployments. That same year, the CSS sources were converted from Stylus to Sass.
In 2019, the development team decided to adopt new tools such as React and Webpack (ADR-004).