Practical guide: the signing of electronic documents by Bruno Lowagie

Bruno Lowagie is the original developer of iText, an innovative PDF library around which an international software publisher has grown. An active member of the ISO and PDF communities, Bruno has written several books on iText and continues to work with the community to improve PDF functionality. When not working on PDF-related topics, Bruno spends much of his time with his wife and two sons.
In 2013, Eddy Haerens from Leefdaal paid the eighth bill sent by his service provider. At least, that is what he believed. In fact, the invoice had been intercepted by an impostor who changed only one thing on it: the provider's bank account number. Eddy paid 30,000 euros, but the legitimate recipient never received the money. By the time Eddy discovered the trick, it was too late. Since the bank had followed his written instructions, it could not cancel the transfer. To top it off, Eddy still had to pay 30,000 euros to his provider.

Document signing: a solution to which problems?

This example illustrates the vulnerabilities inherent in the way we handle documents. When we receive a document in paper or digital format, how do we know that the content has not been tampered with? How do we know that the sender of the document is the person he claims to be? And if we receive a signed contract, how can we prevent the signatory, without calling on a notary, from exclaiming one day "But I never signed such a document!"?
The answers to these three questions correspond to the three "guarantees" that we would like to see built into our documents:
  • Integrity: we want to be sure that the document has not been modified.
  • Authenticity: we want to be sure that the author of the document is the person he claims to be.
  • Non-repudiation: we want to be sure that the author cannot deny being the author of the document.
These three guarantees are built into PDF documents using digital signatures. The technical details are fairly complicated, but here are the main ideas.

Concept no. 1: cryptographic hash functions

A digital document consists of a sequence of bytes organized so that software can display it on screen or print it. A cryptographic hash function takes these bytes and reduces them to a digest of a predefined length. It is not feasible to reconstruct the original bytes from the digest. But if we apply the hash function to the same sequence of bytes, we always get the same digest, and changing even a single byte produces a different digest.
Now take the case where documents are sent to multiple recipients, for example a university sending quarterly report cards to all its students. For reasons of confidentiality, the university cannot put all of these report cards online on a public server. However, if a student sends his report card to another institution where he wishes to continue his studies, that institution should be able to verify that the grades have not been falsified.
The university could solve the problem by publishing the digest of each report card it sends. These digests cannot be used to retrieve students' grades, but any recipient of an electronic report card can generate a digest of the bytes composing the document and compare it with the digest published online. If the two match, the report card has not been falsified.
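As a minimal sketch of these properties, the snippet below uses SHA-256 from Python's standard `hashlib` module. The choice of SHA-256 and the invoice text (including the account numbers) are assumptions made up for illustration; the article does not name a specific hash function.

```python
import hashlib

# Two made-up invoices that differ only in the bank account number.
document = b"Invoice 8: pay 30,000 EUR to account BE00 0000 0000 0000"
tampered = b"Invoice 8: pay 30,000 EUR to account BE99 9999 9999 9999"

digest = hashlib.sha256(document).hexdigest()
same_again = hashlib.sha256(document).hexdigest()
digest_tampered = hashlib.sha256(tampered).hexdigest()

print(digest)                     # hex string of fixed length
print(len(digest))                # 64 hex characters, whatever the input size
print(digest == same_again)       # same bytes give the same digest
print(digest == digest_tampered)  # changed bytes give a different digest
```

The digest has the same length whether the input is one line or a thousand pages, which is what makes it practical to publish or to sign.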
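The university scenario can be sketched as follows. The report card contents, the `published_digests` set and the helper names are hypothetical, and SHA-256 is an assumed choice of hash function.

```python
import hashlib

def digest_of(data: bytes) -> str:
    """Return the SHA-256 digest of the given bytes as a hex string."""
    return hashlib.sha256(data).hexdigest()

# University side: publish only the digest, never the document itself.
report_card = b"Student 1234 - Mathematics: 16/20 - Physics: 14/20"
published_digests = {digest_of(report_card)}

def is_authentic(received: bytes) -> bool:
    """Recipient side: recompute the digest and look it up in the published set."""
    return digest_of(received) in published_digests

# A copy in which one grade has been changed.
forged = b"Student 1234 - Mathematics: 19/20 - Physics: 14/20"

print(is_authentic(report_card))  # the original matches a published digest
print(is_authentic(forged))       # the altered copy does not
```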

Concept no. 2: Public Key Infrastructure (PKI)

Without going into details, a public key infrastructure (PKI) relies on a pair of asymmetric keys: a public key and a private key. The private key cannot be worked out from the public key, but data encrypted with one key can only be decrypted with the other. The private key must remain private (it is usually stored on a physical device from which it cannot be copied). The public key can be shared with anyone.
These keys serve two purposes:
  • Encryption: if I want to share information with a particular person, I can ask him for his public key and encrypt my data with it. Nobody can decrypt this data except the owner of the corresponding private key.
  • Digital signing: if I want to share information with the rest of the world, I can encrypt it with my private key. The rest of the world can use my public key to decrypt this information. The decryption only succeeds if the data was encrypted with my private key, which shows that it came from me.
But asymmetric encryption has a major drawback: it is costly, and the stronger we make the encryption, the more the amount of data to process grows. Hashing overcomes this drawback: instead of encrypting the whole document, we encrypt only its digest.
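The two uses of a key pair described above can be tried out with textbook RSA. This is a toy sketch only: the key is built from two publicly known Mersenne primes and no padding is applied, so it offers no security at all, and the names `apply_key`, `N`, `E` and `D` are chosen for illustration. Real systems use a vetted cryptographic library.

```python
# Toy textbook RSA: insecure, for illustration only.
P = 2**127 - 1                      # a well-known Mersenne prime
Q = 2**521 - 1                      # another well-known Mersenne prime
N = P * Q                           # modulus, shared by both keys
E = 65537                           # public exponent  -> public key is (E, N)
D = pow(E, -1, (P - 1) * (Q - 1))   # private exponent -> private key is (D, N)

def apply_key(value: int, exponent: int) -> int:
    """Transform a number with one half of the key pair."""
    return pow(value, exponent, N)

message = int.from_bytes(b"meet me at noon", "big")

# Encryption: anyone encrypts with the public key,
# only the private key holder can decrypt.
ciphertext = apply_key(message, E)
decrypted = apply_key(ciphertext, D)

# Signing: the private key holder encrypts,
# anyone can decrypt with the public key.
signature = apply_key(message, D)
recovered = apply_key(signature, E)

print(decrypted == message, recovered == message)
```

In both directions, what one key does only the other key can undo, which is the property the two bullet points rely on.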

Concept no. 3: Digital Signature

To share a digitally signed document, you must produce three elements:
  • Bytes of the document as is
  • A public certificate issued by a trusted third party. This certificate contains a public key.
  • An encrypted digest of the document
The recipient of these three elements can validate the digitally signed document in three steps:
  • Create a digest from the bytes of the document (the first element): hash1
  • Decrypt the encrypted digest (the third element) using the public key from the certificate (the second element): hash2
  • If hash1 = hash2, the document has not been altered since it was signed.
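The three validation steps above can be sketched in a few lines. As an assumption for illustration, the sketch uses SHA-256 for the digest and a toy textbook RSA key built from two publicly known Mersenne primes (insecure, no padding); the function names are hypothetical and the public key is passed directly rather than inside a certificate.

```python
import hashlib

# Toy textbook RSA key: insecure, for illustration only.
P, Q = 2**127 - 1, 2**521 - 1
N, E = P * Q, 65537                 # public key: (E, N)
D = pow(E, -1, (P - 1) * (Q - 1))   # private exponent

def digest_as_int(data: bytes) -> int:
    """SHA-256 digest of the bytes, as an integer smaller than N."""
    return int.from_bytes(hashlib.sha256(data).digest(), "big")

def sign(document: bytes) -> int:
    """Signer side: encrypt the digest with the private key."""
    return pow(digest_as_int(document), D, N)

def validate(document: bytes, public_key, encrypted_digest: int) -> bool:
    """Recipient side: the three validation steps."""
    exponent, modulus = public_key
    hash1 = digest_as_int(document)                    # step 1
    hash2 = pow(encrypted_digest, exponent, modulus)   # step 2
    return hash1 == hash2                              # step 3

contract = b"I agree to pay 30,000 EUR."
encrypted_digest = sign(contract)

print(validate(contract, (E, N), encrypted_digest))
print(validate(b"I agree to pay 90,000 EUR.", (E, N), encrypted_digest))
```

Only the short digest goes through the asymmetric operation, so the cost of signing does not depend on the size of the document.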
Do these procedures meet our needs in terms of integrity, authenticity and non-repudiation?

How to trust the information in the public certificate?

If the digest stored in the document corresponds to the digest calculated on the fly from the bytes of the document, integrity is ensured. The match also shows that the stored digest was decrypted correctly with the public key, which is only possible if it was encrypted with the corresponding private key. The owner of that private key is thereby authenticated as the author of the signature. The signatory cannot then deny having signed the document unless he proves that his private key was stolen.
However, one point of uncertainty remains: how can we trust the information in the public certificate? How can we verify the identity of the owner of the private key? How can we verify that the key has not been revoked (in case the owner reported it stolen)?
The Certificate Authority (CA) answers all these questions. A CA issues public and private keys only to parties whose identity it has carefully verified. The CA also maintains a database of all public certificates issued, along with information on revoked keys.
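The role of the CA can be illustrated with a heavily simplified model. Everything here is hypothetical: the dictionary-based certificate, the `|`-separated encoding, the `revoked_serials` set and the toy textbook RSA keys (built from publicly known Mersenne primes, insecure). Real certificates follow the X.509 standard and real revocation checks use mechanisms such as revocation lists published by the CA.

```python
import hashlib

def make_key(p: int, q: int, e: int = 65537):
    """Build a toy RSA key pair (public, private) from two primes."""
    n = p * q
    return (e, n), (pow(e, -1, (p - 1) * (q - 1)), n)

def digest_as_int(data: bytes) -> int:
    return int.from_bytes(hashlib.sha256(data).digest(), "big")

def sign(data: bytes, private_key) -> int:
    d, n = private_key
    return pow(digest_as_int(data), d, n)

def verify(data: bytes, signature: int, public_key) -> bool:
    e, n = public_key
    return pow(signature, e, n) == digest_as_int(data)

ca_public, ca_private = make_key(2**127 - 1, 2**521 - 1)
signer_public, _ = make_key(2**107 - 1, 2**607 - 1)
revoked_serials = set()   # the CA's record of revoked certificates

def encode(subject: str, public_key, serial: int) -> bytes:
    return f"{serial}|{subject}|{public_key[0]}|{public_key[1]}".encode()

def issue_certificate(subject: str, public_key, serial: int) -> dict:
    """CA side: bind an identity to a public key by signing both."""
    body = encode(subject, public_key, serial)
    return {"subject": subject, "public_key": public_key,
            "serial": serial, "ca_signature": sign(body, ca_private)}

def is_trusted(cert: dict) -> bool:
    """Verifier side: check the CA's signature, then the revocation record."""
    body = encode(cert["subject"], cert["public_key"], cert["serial"])
    return (verify(body, cert["ca_signature"], ca_public)
            and cert["serial"] not in revoked_serials)

cert = issue_certificate("Eddy Haerens", signer_public, serial=1)
print(is_trusted(cert))                              # issued by the CA
print(is_trusted(dict(cert, subject="Impostor")))    # altered identity
```

The verifier only needs to trust the CA's public key in advance; the signatory's identity and key then follow from the CA's signature on the certificate.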

How to validate a digital signature?

It would be impractical to send the public certificate, the encrypted digest and the document separately. It would also be very difficult to validate a digital signature in several steps. The solution? The PDF format, which stores all validation-related information (VRI) in the document itself. Validation of the digital signature is done automatically in the PDF viewer.
Figure 1: Validation of the digital signature
The blue part shows the syntax of a PDF document. The pink part is reserved for the digital signature. All bytes of the PDF document are signed except the signature (which is not in the byte range).
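The idea of signing every byte except the signature itself can be sketched as follows. In a signed PDF the byte range is recorded as pairs of offset and length; the bytes below are a mock stand-in, not a valid PDF, and the function name and SHA-256 are assumptions for illustration.

```python
import hashlib

def signed_bytes_digest(pdf: bytes, byte_range: list) -> str:
    """Hash only the bytes covered by byte_range = [offset1, length1, offset2, length2]."""
    h = hashlib.sha256()
    for i in range(0, len(byte_range), 2):
        offset, length = byte_range[i], byte_range[i + 1]
        h.update(pdf[offset:offset + length])
    return h.hexdigest()

# Mock file layout: bytes before the signature, the space reserved
# for the signature, and the bytes after it.
before = b"%PDF-1.7 ... /Contents "
hole = b"<" + b"0" * 16 + b">"      # placeholder where the signature is stored
after = b" ... %%EOF"
pdf = before + hole + after

# The byte range covers everything except the reserved space.
byte_range = [0, len(before), len(before) + len(hole), len(after)]
digest = signed_bytes_digest(pdf, byte_range)

# Filling in the signature does not change the digest...
filled = before + b"<" + b"A" * 16 + b">" + after
print(signed_bytes_digest(filled, byte_range) == digest)

# ...but changing any other byte does.
altered = before.replace(b"1.7", b"1.4") + hole + after
print(signed_bytes_digest(altered, byte_range) == digest)
```

Excluding the reserved space is what allows the signature to be written into the file after the digest has been computed.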
Figure 2: Information that can be stored inside the digital signature
The signature includes at least a digest of the signed message and a public certificate with the identity of the signatory. In accordance with good practice, the signature may also include the following information:
  • The complete certificate chain linking the CA root certificate to the signatory's public certificate
  • Revocation information establishing that the certificate was valid at a specific point in time
  • A timestamp that irrefutably indicates the time the signature was created
Figure 3: What a well-signed PDF document looks like in Adobe Reader
To create digitally signed documents as shown in Figure 3, you need software that performs the steps described in this article, and certificates issued by a CA that Adobe trusts.
As a technology partner of AA, iText Software Group provides PDF tools that make working with digital documents smarter. iText pushes the limits of the interactivity of digital documents. The software makes your documents workable and smart, and allows you to control access and changes to documents, digitally sign documents, archive documents digitally, and more. Free and open source software since 2000, iText offers the best documented, best performing and richest open source PDF engine in Java and C#.
