This guide is for you if you want to migrate a mailing list to Discourse.
This guide is still a work in progress!
1. Importing using a Docker container
This is the recommended way for importing content from your mailing lists into Discourse.
1.1. Installing Discourse
Install Discourse by following the official installation guide.
Afterwards it’s a good idea to go to the Admin section and configure a few settings:
- Enable login_required if imported topics shouldn’t be visible to the public.
- Enable hide_user_profiles_from_public if user profiles shouldn’t be visible to the public.
- Disable download_remote_images_to_local if you don’t want Discourse to download images embedded in posts.
- Enable disable_edit_notifications if you enabled download_remote_images_to_local and don’t want your users to get lots of notifications about posts edited by the system user.
- Change the value of slug_generation_method if most of the topic titles use characters which shouldn’t be mapped to ASCII (e.g. Arabic). See this post for more information.
The following steps assume that you installed Discourse on Ubuntu and that you are connected to the machine via SSH or have direct access to the machine’s terminal.
1.2. Preparing the Docker container
Copy the container configuration file app.yml to import.yml and edit it with your favorite editor.
cd /var/discourse
cp containers/app.yml containers/import.yml
nano containers/import.yml
Add - "templates/import/mbox.template.yml" to the list of templates. Afterwards it should look something like this:
templates:
- "templates/postgres.template.yml"
- "templates/redis.template.yml"
- "templates/web.template.yml"
- "templates/web.ratelimited.template.yml"
## Uncomment these two lines if you wish to add Lets Encrypt (https)
#- "templates/web.ssl.template.yml"
#- "templates/web.letsencrypt.ssl.template.yml"
- "templates/import/mbox.template.yml"
That’s it. You can save the file, close the editor and build the container.
/var/discourse/launcher stop app
/var/discourse/launcher rebuild import
Building the container creates an import directory within the container’s shared directory. It looks like this:
/var/discourse/shared/standalone/import
├── data
└── settings.yml
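If the import directory does not show up after the rebuild, one thing worth checking is whether the template line actually made it into import.yml as an active entry. The following is a minimal sketch of such a check; the function name has_mbox_template is made up for this example, and the pattern assumes the entry is written the same way as in the template list above.

```shell
#!/usr/bin/env bash
# Sketch: check that a container config lists the mbox import template
# as an active (uncommented) list entry.
# has_mbox_template is a hypothetical helper name, not part of Discourse.

has_mbox_template() {
  local config_file="$1"
  # Match a list entry such as:  - "templates/import/mbox.template.yml"
  # Entries commented out with '#' do not match.
  grep -Eq '^[[:space:]]*-[[:space:]]*"?templates/import/mbox\.template\.yml"?[[:space:]]*$' "$config_file"
}

# Usage (path taken from the steps above):
# has_mbox_template /var/discourse/containers/import.yml && echo "template found"
```

The function only reports success or failure through its exit status, so it can be combined with && or used in an if statement.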
1.3. Configuring the importer
You can configure the importer by editing the example settings.yml file that has been copied into the import directory.
nano /var/discourse/shared/standalone/import/settings.yml
The settings file is well documented and comes with sensible defaults, but here are a few tips anyway:
- The settings file contains multiple examples on how to split data files:
  - mbox files are usually separated by a From header. Choose a regular expression that works for your files.
  - If each of your files contains only one message, set the split_regex to an empty string.
  - There’s also an example for files from the popular Listserv mailing list software.
- prefer_html allows you to configure whether the import should use the HTML part of emails when it exists. You should choose what suits you best – it heavily depends on the emails sent to your mailing list.
- By default each user imported from the mailing list is created as a staged user. You can disable that behaviour by setting staged to false.
- If your emails do not contain a Message-ID header (like messages stored by Listserv), you should enable the group_messages_by_subject setting.
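Before settling on a split_regex, it can help to see how many messages a candidate pattern would find in one of your files. The sketch below counts matching lines with grep; count_mbox_messages is a hypothetical helper name and the default pattern '^From ' is only an assumption about how your files are separated. grep’s extended regular expressions are not identical to the regular expressions used by the importer, so treat the result as a rough sanity check.

```shell
#!/usr/bin/env bash
# Sketch: count the lines in a file that match a candidate separator pattern.
# count_mbox_messages is a hypothetical helper, not part of the importer.

count_mbox_messages() {
  local mbox_file="$1"
  local pattern="${2:-^From }"   # assumed default: classic mbox "From " separator
  # grep -c prints the number of matching lines (0 if there are none).
  grep -Ec -- "$pattern" "$mbox_file"
}

# Usage (file name taken from the example directory layout in this guide):
# count_mbox_messages "/var/discourse/shared/standalone/import/data/list 2/2017-12.mbox"
```

If the count is far off from the number of messages you expect, the pattern probably needs adjusting before you run the import.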
1.4. Prepare files
Each subdirectory of /var/discourse/shared/standalone/import/data gets imported as its own category and each directory should contain the data files you want to import. The file names of those do not matter.
Example: The import directory should look like this if you want to import two mailing lists with multiple mbox files:
/var/discourse/shared/standalone/import
├── data
│   ├── list 1
│   │   ├── foo
│   │   └── bar
│   └── list 2
│       ├── 2017-12.mbox
│       └── 2018-01.mbox
└── settings.yml
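One way to get files into that layout is a small helper that creates the subdirectory for a list and copies the archive files into it. This is only a sketch: stage_list_files is a made-up name, the source path ~/archives in the usage line is hypothetical, and depending on the permissions of the shared directory you may need to run it as root.

```shell
#!/usr/bin/env bash
# Sketch: copy mailing list archive files into data/<list name> below the
# import directory. stage_list_files is a hypothetical helper name.

stage_list_files() {
  local import_dir="$1"
  local list_name="$2"
  shift 2
  if [ "$#" -eq 0 ]; then
    echo "no files given for list '$list_name'" >&2
    return 1
  fi
  local target="$import_dir/data/$list_name"
  # Quoting keeps directory names with spaces (e.g. "list 1") intact.
  mkdir -p "$target"
  cp -- "$@" "$target/"
}

# Usage (~/archives is a hypothetical source location):
# stage_list_files /var/discourse/shared/standalone/import "list 2" \
#   ~/archives/2017-12.mbox ~/archives/2018-01.mbox
```

Calling the function once per mailing list gives you one subdirectory, and therefore one category, per list.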
1.5. Executing the import script
Tip: It’s a good idea to start the import inside a tmux or screen session so that you can reconnect to the session in case of SSH connection loss.
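As a rough sketch of the tmux variant of that tip: the helper below attaches to a named session and creates it first if it does not exist yet. The function name open_import_session and the session name mbox-import are arbitrary choices for this example, and tmux has to be installed on the host.

```shell
#!/usr/bin/env bash
# Sketch: open (or re-open) a named tmux session for the import.
# open_import_session and the default session name are arbitrary choices.

open_import_session() {
  local session="${1:-mbox-import}"
  # -A makes new-session attach if a session with that name already exists.
  tmux new-session -A -s "$session"
}

# Usage:
# open_import_session
```

With tmux’s default key bindings you can detach with Ctrl+B followed by D and later call the function again to get back to the running import.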
Let’s start the import by entering the Docker container and launching the import script inside it.
/var/discourse/launcher enter import
import_mbox.sh # inside the Docker container
Depending on the size of your mailing lists, the import can take quite a while.
The import script will show you a message like this when it’s finished: Done (00h 26min 52sec)
Tip: You can abort the import anytime you want by pressing Ctrl+C. When you restart the import it will continue where it left off.
You can exit and stop the Docker container after the import has finished.
exit # inside the Docker container
/var/discourse/launcher stop import
1.6. Starting Discourse
Let’s start the app container and take a look at the imported data.
/var/discourse/launcher start app
Discourse will start and Sidekiq will begin post-processing all the imported posts. This can take a considerable amount of time. You can watch the progress by logging in as admin and visiting http://discourse.example.com/sidekiq
1.7. Clean up
So, you are satisfied with the result of the import and want to free some disk space? The following commands will delete the Docker container used for importing as well as all the files used during the import.
/var/discourse/launcher destroy import
rm /var/discourse/containers/import.yml
rm -R /var/discourse/shared/standalone/import
1.8. The End
Now it’s time to celebrate and enjoy your new Discourse instance!