The Dunyazad Digital Library

Plain Text Files in the Dunyazad Digital Library

I provide the plain text versions of the Dunyazad Library books for four reasons:

– it is the most widely compatible digital format, now and for the foreseeable future;

– you can, at least in principle, convert it to any e-book format there is or ever will be;

– it allows you to perform complex search operations, using regular expressions, that are usually not supported by e-book readers;

– using this format, you can relatively easily convert your own plain text files to the e-book format of your choice (see below).
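As a small illustration of the regular expression searches mentioned above, here is a minimal Python sketch. The function names, the sample text and the file name in the comment are made up for this example; the only thing taken from this page is that the files are plain text in Windows 1252 encoding.

```python
import re


def find_matches(text, pattern, flags=re.IGNORECASE):
    """Return (line_number, line) pairs for every line matching the regex."""
    regex = re.compile(pattern, flags)
    return [
        (number, line)
        for number, line in enumerate(text.splitlines(), start=1)
        if regex.search(line)
    ]


def search_file(path, pattern):
    """Search a plain text file, assuming it is encoded in Windows 1252."""
    with open(path, encoding="cp1252") as handle:
        return find_matches(handle.read(), pattern)


# Hypothetical sample text, not taken from any book in the library
sample = (
    "» Example Title\n"
    "by An Author\n"
    "\n"
    "The ship sailed at dawn.\n"
    "She shipped the letters.\n"
)

# \b marks a word boundary, so "shipped" is not reported
print(find_matches(sample, r"\bship\b"))
# prints [(4, 'The ship sailed at dawn.')]
```

For a real file you would call something like search_file("mybook.txt", r"\bship\b"), with your own file name in place of the hypothetical one.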

The Dunyazad Plain Text Format

I use a tagged text format that is deliberately kept very simple; for lack of a better name, let’s call it Dunyazad Text Format. It is not a markup language, and it is not intended to support typographical or layout niceties. Its main purpose is to be readable “as is” in plain text, without requiring the reader to learn anything, but it provides a few essential formatting options that help with reading simply structured text of the kind usually found in fiction.

A line of text is a paragraph.

A blank line is a blank line.

A line consisting only of one ~ character is a separator within a chapter (scene break).

» This is heading level 1.

»» This is heading level 2.

»»» This is heading level 3.

The first line is the title; it has to be a level 1 heading.

Level 1 headings (except for the title) are preceded by two blank lines and followed by one blank line.

Level 2 and 3 headings are preceded by single blank lines.

Quotes and lyrics are preceded and followed by single blank lines.

A line before the first blank line that begins with "by" (case sensitive, followed by a space) contains the name of the author.

The underscore character toggles normal/italic text. Line (paragraph) ends do not reset italic to normal!

Footnote references are enclosed in curly brackets { }.

Footnotes are enclosed in curly brackets, beginning with the reference number or symbol followed by a colon and a space. A footnote may have several paragraphs.

Footnotes immediately follow the paragraphs in which they are referenced, or, in the case of poetry, the poems.
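The rules above are simple enough that a few lines of code can interpret them. The following Python sketch shows one possible reading of the heading, scene break, author and italics rules. It is not the code of any existing tool. The function names, the decision to treat lines with more than three » characters as ordinary paragraphs, and the use of <i> tags for italics are assumptions made for this example, and footnotes are left out.

```python
def classify_line(line):
    """Return a (kind, text) pair for one line of Dunyazad-formatted text."""
    stripped = line.strip()
    if not stripped:
        return ("blank", "")
    if stripped == "~":
        return ("scene_break", "")
    # Count the leading » characters to get the heading level
    level = len(stripped) - len(stripped.lstrip("»"))
    if 1 <= level <= 3:
        return ("h%d" % level, stripped[level:].strip())
    # Assumption: anything else, including four or more », is a paragraph
    return ("paragraph", stripped)


def find_author(lines):
    """Return the author from a line starting with 'by ' before the first blank line."""
    for line in lines:
        if not line.strip():
            break
        if line.startswith("by "):  # case sensitive, followed by a space
            return line[3:].strip()
    return None


def italics_to_html(paragraphs):
    """Replace underscores with <i> and </i>, carrying the state across paragraphs."""
    italic = False
    result = []
    for text in paragraphs:
        # Re-open italics if the previous paragraph ended in italic text
        parts = ["<i>"] if italic else []
        for char in text:
            if char == "_":
                parts.append("</i>" if italic else "<i>")
                italic = not italic
            else:
                parts.append(char)
        if italic:
            parts.append("</i>")  # close the tag; the state still carries over
        result.append("".join(parts))
    return result


# Hypothetical sample lines
lines = ["» A Sample Title", "by Jane Doe", "", "It was _very", "dark_ outside.", "~"]
print([classify_line(line)[0] for line in lines])
# prints ['h1', 'paragraph', 'blank', 'paragraph', 'paragraph', 'scene_break']
print(find_author(lines))
# prints Jane Doe
print(italics_to_html(["It was _very", "dark_ outside."]))
# prints ['It was <i>very</i>', '<i>dark</i> outside.']
```

Closing and re-opening the tag at each paragraph boundary keeps every paragraph well-formed on its own while still honoring the rule that a line end does not reset italic to normal.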

Character set

The character set that I use is Windows 1252. Unless the text contains non-standard (e.g. German, French or Spanish) characters, only four symbols are used that are not also part of the standard ISO-8859-1 character set: typographic quotation marks and apostrophes. If your system does not display these characters correctly, you can easily replace them using any text editor.

For better compatibility and for better readability with monospaced fonts, em dashes are represented as double hyphens, and ellipses as three dots. If your system supports Windows 1252, and if you use a proportional font, you can easily convert them back to the typographically correct symbols.

Symbols outside the scope of Windows 1252 are replaced by standard Latin characters. Text in non-Latin alphabets (e.g. Greek) is lost, as are illustrations, and some typographical or layout details.
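Both replacements described in this section can also be scripted rather than done in a text editor. Here is a minimal Python sketch, assuming the text has been read as Windows 1252 (called cp1252 in Python). The function names and the sample bytes are invented for this example, and the simple string replacement does not treat runs of three or more hyphens or four or more dots specially.

```python
# The four typographic quotation marks and apostrophes, mapped to ASCII
ASCII_QUOTES = str.maketrans({
    "\u201c": '"',   # left double quotation mark
    "\u201d": '"',   # right double quotation mark
    "\u2018": "'",   # left single quotation mark
    "\u2019": "'",   # right single quotation mark / apostrophe
})


def to_plain_quotes(text):
    """Replace typographic quotes and apostrophes with plain ASCII ones."""
    return text.translate(ASCII_QUOTES)


def restore_typography(text):
    """Turn double hyphens into em dashes and three dots into ellipsis characters."""
    return text.replace("--", "\u2014").replace("...", "\u2026")


# Hypothetical sample: bytes as they would be stored in a Windows 1252 file
raw = b"\x93Wait -- don\x92t go ...\x94"
text = raw.decode("cp1252")

print(to_plain_quotes(text))
# prints "Wait -- don't go ..."

# The restored symbols are all part of Windows 1252, so this encodes cleanly
restored = restore_typography(text)
print(restored.encode("cp1252"))
# prints b'\x93Wait \x97 don\x92t go \x85\x94'
```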

Converting to HTML

You can download a tool to convert Dunyazad Text files to HTML, and you can use it for your own files if they follow the above rules. dthtm.exe has to be run from the Windows command line. You will find the details in the file dthtm.txt that is included in the download zip file.

Download dthtm (exe, doc and source files). No installation required, just unpack the zip archive and run dthtm.exe with the appropriate arguments. You can safely ignore or delete the two source code files.

Current version: 1.2 (Jan 16, 2020)

Requirements: Any not too ancient version of Windows, and a basic knowledge of how to use the command line.

You may use dthtm freely for non-commercial purposes, but you may not re-distribute the program without my permission.

If you want to create your own Dunyazad-formatted text files, please note:

For headings you can use one of the five symbols $ . * # = instead of »

For scene breaks you can use * or * * * instead of ~

Blank lines before headings are not required.

A URL gets converted to a link if it is a line by itself. It can start with "http://", "https://" or "www.", must not contain spaces, but may contain underscore characters.
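To make the scene break and URL rules concrete, here is a minimal Python sketch of how such lines could be recognized. This is not the dthtm source code. The function names are invented, and prefixing "http://" to addresses that start with "www." is an assumption of this example.

```python
import html

URL_PREFIXES = ("http://", "https://", "www.")
SCENE_BREAKS = ("~", "*", "* * *")


def is_scene_break(line):
    """True if the line consists only of one of the scene break markers."""
    return line.strip() in SCENE_BREAKS


def is_url_line(line):
    """True if the whole line is a URL: known prefix and no spaces inside."""
    stripped = line.strip()
    has_space = any(char.isspace() for char in stripped)
    return stripped.startswith(URL_PREFIXES) and not has_space


def url_to_link(line):
    """Wrap a URL line in an HTML link (assumes is_url_line(line) is true)."""
    url = line.strip()
    # Assumption: a bare "www." address needs a scheme to work as a link target
    href = "http://" + url if url.startswith("www.") else url
    return '<a href="%s">%s</a>' % (html.escape(href), html.escape(url))


print(is_scene_break("* * *"))
# prints True
print(is_url_line("see https://example.org"))
# prints False
print(url_to_link("www.example.org/some_page"))
# prints <a href="http://www.example.org/some_page">www.example.org/some_page</a>
```

Because underscores are allowed in URLs, a converter along these lines would need to check for URL lines before it handles underscores as italic toggles.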

Converting HTML to ePub or Mobi

From the HTML format you can (for instance) use calibre to convert the file to any of your favorite e-book formats, e.g. ePub or Mobi. When configured correctly (using //h:h1 and //h:h2 for heading levels 1 and 2), the resulting e-book will have a properly working TOC.

From the command line, you can use these commands (replace "source" and "target" with the file names):

ebook-convert source.htm target.epub --epub-inline-toc --level1-toc //h:h1 --level2-toc //h:h2 --page-breaks-before //h:h1 --language en

ebook-convert source.htm target.mobi --level1-toc //h:h1 --level2-toc //h:h2 --page-breaks-before //h:h1 --language en
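If you have several files to convert, the call can be wrapped in a small script. Here is a minimal Python sketch that builds the same kind of argument list as shown above and hands it to ebook-convert. It assumes that calibre is installed and that ebook-convert is on your search path; the function names and file names are made up for this example.

```python
import subprocess


def build_command(source, target):
    """Build the ebook-convert argument list for one HTML file."""
    command = ["ebook-convert", source, target]
    if target.lower().endswith(".epub"):
        # The inline table of contents option only applies to ePub output
        command.append("--epub-inline-toc")
    command += [
        "--level1-toc", "//h:h1",
        "--level2-toc", "//h:h2",
        "--page-breaks-before", "//h:h1",
        "--language", "en",
    ]
    return command


def convert(source, target):
    """Run ebook-convert; raises CalledProcessError if the conversion fails."""
    subprocess.run(build_command(source, target), check=True)


# Hypothetical file names; this only prints the command without running it
print(" ".join(build_command("source.htm", "target.epub")))
```

Passing the arguments as a list avoids quoting problems with file names that contain spaces.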

You can, of course, use the calibre GUI instead of ebook-convert.exe. You can also create an ePub file with OpenOffice or LibreOffice, once you have installed the writer2epub extension:

Create a new text document, insert the .htm file, and export to ePub, after setting the appropriate metadata and preferences.


When converting with dthtm, any HTML code can be embedded in the text, except heading and paragraph tags.

dthtm does not link footnotes; they appear as and where they are in the text.

Feel free to contact me with any questions, suggestions or comments you may have.

