PDF magazines

Revision as of 11:43, 25 May 2020 by Black Squirrel (talk | contribs) (Created page with "Retro CDN hosts magazine scans in [https://en.wikipedia.org/wiki/PDF PDF (Portable Document Format) format]. The following explains why the Retro wikis have chosen this format...")
Retro CDN hosts magazine scans in PDF (Portable Document Format). The following explains why the Retro wikis have chosen this format over the alternatives as a means of representing magazines.

PDF was invented by Adobe Systems Incorporated in 1993 and was standardised as ISO 32000 in 2008. The format exists as a means of displaying documents independently of software, hardware and operating system: provided your PDF viewer adheres to the standard, a PDF file should look the same on every supported device. Prior to PDF, this was not always guaranteed, as word processors were produced by many different vendors for many different platforms.

Usage on the Retro wikis

Each of the Retro wikis supports the viewing of PDF files in-browser. Aside from magazine scans, this was required because many of the files we host are genuine PDFs (including digital magazines!), where the position and formatting of each page actually matters. Since that support was always going to be there, it was decided to use the same format to display magazine scans.

The PDF specification allows for JPEG images to be embedded in PDF documents. By scanning each page of a magazine as a JPEG, it can then be embedded into a PDF file to replicate the construction of the magazine, i.e. each page of the PDF represents one page of the magazine. We have a program which does just that - it takes a set of JPEGs and produces a PDF. This program does not compress or alter the given JPEGs further - the images do not change, they are just presented differently.
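To illustrate the idea (this is not the program the wikis actually use), the following is a minimal sketch in Python of how JPEGs can be wrapped in a PDF without being re-encoded. Each JPEG is stored byte-for-byte as an image stream marked with the PDF `/DCTDecode` filter. The function names `jpeg_size` and `jpegs_to_pdf`, the `dpi` parameter and the restriction to greyscale and RGB images are assumptions made for this example.

```python
import struct


def jpeg_size(data):
    """Return (width, height, components) read from the JPEG start-of-frame marker."""
    if data[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG file")
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            raise ValueError("expected a JPEG marker")
        marker = data[pos + 1]
        if marker == 0xFF:  # padding byte before a marker
            pos += 1
            continue
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        # SOF0..SOF15 carry the frame size; C4, C8 and CC are not frame headers.
        if 0xC0 <= marker <= 0xCF and marker not in (0xC4, 0xC8, 0xCC):
            _bits, height, width, comps = struct.unpack(">BHHB", data[pos + 4:pos + 10])
            return width, height, comps
        pos += 2 + length
    raise ValueError("no start-of-frame marker found")


def jpegs_to_pdf(jpegs, dpi=96.0):
    """Build a PDF with one page per JPEG, leaving the JPEG bytes untouched."""
    objs = {1: b"<< /Type /Catalog /Pages 2 0 R >>"}
    kids = []
    for i, jpeg in enumerate(jpegs):
        width, height, comps = jpeg_size(jpeg)
        if comps not in (1, 3):
            raise ValueError("this sketch only handles greyscale and RGB JPEGs")
        space = "/DeviceGray" if comps == 1 else "/DeviceRGB"
        img, content, page = 3 + 3 * i, 4 + 3 * i, 5 + 3 * i
        pw, ph = width * 72.0 / dpi, height * 72.0 / dpi  # PDF units are 1/72 inch
        head = (f"<< /Type /XObject /Subtype /Image /Width {width} /Height {height} "
                f"/ColorSpace {space} /BitsPerComponent 8 /Filter /DCTDecode "
                f"/Length {len(jpeg)} >>\nstream\n").encode("ascii")
        objs[img] = head + jpeg + b"\nendstream"  # the JPEG is copied as-is
        draw = f"q {pw:.2f} 0 0 {ph:.2f} 0 0 cm /Im0 Do Q".encode("ascii")
        objs[content] = b"<< /Length %d >>\nstream\n" % len(draw) + draw + b"\nendstream"
        objs[page] = (f"<< /Type /Page /Parent 2 0 R /MediaBox [0 0 {pw:.2f} {ph:.2f}] "
                      f"/Resources << /XObject << /Im0 {img} 0 R >> >> "
                      f"/Contents {content} 0 R >>").encode("ascii")
        kids.append(f"{page} 0 R")
    objs[2] = f"<< /Type /Pages /Count {len(kids)} /Kids [{' '.join(kids)}] >>".encode("ascii")

    out = bytearray(b"%PDF-1.4\n")
    offsets = []
    for num in sorted(objs):
        offsets.append(len(out))  # byte offset of each object for the xref table
        out += b"%d 0 obj\n" % num + objs[num] + b"\nendobj\n"
    xref_at = len(out)
    out += b"xref\n0 %d\n0000000000 65535 f \n" % (len(objs) + 1)
    for off in offsets:
        out += b"%010d 00000 n \n" % off
    out += b"trailer\n<< /Size %d /Root 1 0 R >>\nstartxref\n%d\n%%%%EOF\n" % (len(objs) + 1, xref_at)
    return bytes(out)
```

Given a list of JPEG byte strings read from disk in page order, `jpegs_to_pdf(pages)` returns the bytes of a PDF that can be written straight to a file. Because the image data is copied rather than decoded, the pages in the PDF are the same JPEGs that went in.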

The outcome is a digital representation of an entire magazine as just one file. If created correctly, the page numbers will line up, and the aforementioned support on our wikis means each individual page will be viewable without the need for extra downloads. If the wiki fails, all major web browsers support the viewing of PDF files natively, and it is common to see Adobe Reader (or a third-party alternative) installed on a user's PC.

The alternatives

ZIP/RAR/7Z archives

A common means of distributing scanned magazines is to compile each page into an archive, typically ZIP, RAR or 7Z, which are all widely used. Prior to gaining PDF support, this was the Retro wikis' format of choice, but there are several limitations:

  • The user is forced to download the entire archive, when they may just want to see one page
  • There is no means of viewing the contents online
  • RAR and 7Z are not natively supported by operating systems, meaning extra software is required to decompress these archives

These formats are also commonly used to compress files, which is good for internet usage where bandwidth is at a premium. However, JPEGs do not compress well, as the format is, by its very nature, already a compressed image. The space savings are usually negligible with magazine scans, so in many ways it would be just as useful to upload each page as a separate image (although this makes wiki maintenance more difficult).
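The effect can be demonstrated with a small sketch in Python. It uses random bytes as a stand-in for the already-compressed data inside a JPEG (an assumption made to keep the example self-contained, as a real scan is not included here) and compares it with highly repetitive text, using the same DEFLATE algorithm that ZIP archives commonly use. The name `deflate_ratio` is hypothetical.

```python
import os
import zlib


def deflate_ratio(data):
    """Compressed size divided by original size (1.0 means nothing was saved)."""
    return len(zlib.compress(data, 9)) / len(data)


# Repetitive text compresses very well...
text_like = b"The quick brown fox jumps over the lazy dog. " * 2000
# ...whereas high-entropy bytes, standing in for JPEG data, barely shrink at all.
jpeg_like = os.urandom(len(text_like))

print(f"repetitive text: {deflate_ratio(text_like):.3f}")
print(f"JPEG-like bytes: {deflate_ratio(jpeg_like):.3f}")
```

Running the same function over real page scans is a quick way to check how little an archive would save for a given magazine.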

While perfect for distributing collections of files, these formats are cumbersome for books and magazines, where the intention is to display a series of images in a set order. They are, however, easier to edit, as pages can easily be replaced or added if necessary.

Comic book RAR/ZIP

Many comic book readers can interpret the contents of ZIP and RAR files as a "comic book" (a series of images, much like book or magazine scans). If the files inside are named in a certain way, they should display correctly. In order to achieve this though, many earlier readers required RAR and ZIP files to be renamed CBR and CBZ, respectively - an extra step likely to confuse those who are not familiar with comic book readers.
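As a rough illustration of what "named in a certain way" tends to mean in practice, the sketch below (Python) packs pages into a ZIP with zero-padded file names so that they sort in reading order. The `page_001.jpg` naming scheme and the function name `build_cbz` are assumptions for this example, as individual readers may apply their own sorting rules.

```python
import io
import zipfile


def build_cbz(pages):
    """Pack page images into a ZIP whose entry names sort in reading order."""
    buf = io.BytesIO()
    digits = max(3, len(str(len(pages))))  # pad so page 10 sorts after page 2
    # ZIP_STORED: the JPEGs are already compressed, so deflating gains little.
    with zipfile.ZipFile(buf, "w", compression=zipfile.ZIP_STORED) as archive:
        for number, image in enumerate(pages, start=1):
            archive.writestr(f"page_{number:0{digits}d}.jpg", image)
    return buf.getvalue()
```

Writing the returned bytes to a file with a `.cbz` extension gives an ordinary ZIP archive under a different name, which is all the renaming step described above amounts to.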

This, combined with the requirement for extra software and the aforementioned issues with standard archives, makes the format unattractive.

DjVu

DjVu was designed to out-perform PDF in similar tasks, and for many years had a strong following due to being backed by the likes of Archive.org. Support was never widespread, however, and its usage has declined since the mid-2000s. Special software is required to view DjVu files as the format is not as ubiquitous as PDF. It is harder to work with, and so should be avoided.

Frequently asked questions

Why are PDF thumbnails pixelated and small?

The DPI (dots per inch) value of a JPEG has an effect on how the PDF is rendered. While scanners may produce JPEG files listed as 300dpi or 600dpi, this value is best converted to 96dpi for optimum rendering on the wiki. You should not have to resize anything - use this tool to change it.

This file demonstrates the effects of having incorrectly set DPI values.
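For the curious, changing the DPI value without resizing means editing a few header bytes and leaving the image data alone. The sketch below (Python) is not the tool mentioned above. It assumes the JPEG begins with a standard JFIF header, which is where the density fields live, and will refuse files that do not (for example, some camera JPEGs that only carry Exif data). The function name `set_jfif_dpi` is hypothetical.

```python
import struct


def set_jfif_dpi(jpeg, dpi=96):
    """Return a copy of a JFIF JPEG with its density fields set to `dpi`."""
    # Header layout: SOI (2 bytes) | APP0 marker (2) | length (2) | "JFIF\0" (5)
    #                | version (2) | units (1) | X density (2) | Y density (2) | ...
    if jpeg[:4] != b"\xff\xd8\xff\xe0" or jpeg[6:11] != b"JFIF\x00":
        raise ValueError("no JFIF header at the start of the file")
    patched = bytearray(jpeg)
    # Units value 1 means "dots per inch"; the pixel data is not touched.
    patched[13:18] = struct.pack(">BHH", 1, dpi, dpi)
    return bytes(patched)
```

The output is the same length as the input and differs only in those five header bytes, so the page images themselves are unchanged.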

Summary

The Retro wikis use the PDF format because:

  • The format is standardised
  • The format is in widespread use, and has been in active service since 1993
  • It is supported by our wiki software
  • PDF support is built into all major web browsers
  • The tools exist to work with it (JPEG -> PDF, PDF -> JPEG)
  • The gains from archive compression are deemed negligible when working with JPEG images