How to extract, delete and edit metadata in LibreOffice files
This article focuses on metadata in LibreOffice files. LibreOffice is a free, open-source, cross-platform office suite that is quite popular. It is largely compatible with MS Office and is generally a very good alternative to it.
If you are concerned about anonymity and use, for example, the Tails operating system, LibreOffice is already installed there. Most Linux distributions either install LibreOffice by default or provide it in their standard repositories, so it is the office suite you will almost always find on Linux. LibreOffice also works well on Windows and is quite popular there too.
LibreOffice handles metadata in its own way, so even if you already know how to remove metadata from Word documents, there are specifics to learn. A similar article covers Word: ‘How to view metadata in MS Word files. How to remove and edit Word metadata’.
Metadata in LibreOffice Writer Files
LibreOffice files also contain metadata. In this article I will look at files created in Writer, the LibreOffice equivalent of MS Word. Writer files have the .odt extension (regular documents) or the .ott extension (document templates).
In fact, Writer supports many formats:
- ODF (flat XML) text document: .fodt
- Unified Office Format text: .uot
- Different versions of Word
- Other
But the native and most frequently used format is .odt, so we’ll focus on it. (If you are interested, you can test how well the mat and mat2 programs find and delete metadata in other formats, and how much metadata other office file formats contain.)
To make it a bit more interesting, I added a digital signature to the test document and tried to add a macro.
Let's start by viewing the metadata that the mat command finds:
mat -d file2.odt
Example output:
[+] File file2.odt :
Harmful metadata found:
Thumbnails/thumbnail.png's zipinfo: {'modified': (2019, 7, 18, 14, 2, 18), 'system': 'unknown'}
content.xml's zipinfo: {'modified': (2019, 7, 18, 14, 2, 18), 'system': 'unknown'}
Configurations2/popupmenu/'s zipinfo: {'modified': (2019, 7, 18, 14, 2, 36), 'system': 'unknown'}
styles.xml's zipinfo: {'modified': (2019, 7, 18, 14, 2, 18), 'system': 'unknown'}
layout-cache's zipinfo: {'modified': (2019, 7, 18, 14, 2, 18), 'system': 'unknown'}
editing-duration: PT15M15S
generator: LibreOffice/6.2.5.2$Linux_X86_64 LibreOffice_project/20$Build-2
Configurations2/images/Bitmaps/'s zipinfo: {'modified': (2019, 7, 18, 14, 2, 36), 'system': 'unknown'}
editing-cycles: 5
settings.xml's zipinfo: {'modified': (2019, 7, 18, 14, 2, 18), 'system': 'unknown'}
creation-date: 2019-07-18T13:45:13.664694066
META-INF/documentsignatures.xml's zipinfo: {'modified': (2019, 7, 18, 14, 2, 36), 'system': 'unknown'}
Configurations2/toolpanel/'s zipinfo: {'modified': (2019, 7, 18, 14, 2, 36), 'system': 'unknown'}
meta.xml's zipinfo: {'modified': (2019, 7, 18, 14, 2, 18), 'system': 'unknown'}
Configurations2/toolbar/'s zipinfo: {'modified': (2019, 7, 18, 14, 2, 36), 'system': 'unknown'}
META-INF/manifest.xml's zipinfo: {'modified': (2019, 7, 18, 14, 2, 18), 'system': 'unknown'}
print-date: 2019-07-18T16:18:33.320774451
date: 2019-07-18T17:02:01.639751003
Basic/Standard/Test_Macro.xml's zipinfo: {'modified': (2019, 7, 18, 14, 2, 18), 'system': 'unknown'}
Configurations2/statusbar/'s zipinfo: {'modified': (2019, 7, 18, 14, 2, 36), 'system': 'unknown'}
manifest.rdf's zipinfo: {'modified': (2019, 7, 18, 14, 2, 18), 'system': 'unknown'}
Scripts/javascript/Python_Macro/parcel-descriptor.xml's zipinfo: {'modified': (2019, 7, 18, 14, 2, 18), 'system': 'unknown'}
Configurations2/progressbar/'s zipinfo: {'modified': (2019, 7, 18, 14, 2, 36), 'system': 'unknown'}
Configurations2/menubar/'s zipinfo: {'modified': (2019, 7, 18, 14, 2, 36), 'system': 'unknown'}
Configurations2/floater/'s zipinfo: {'modified': (2019, 7, 18, 14, 2, 36), 'system': 'unknown'}
mimetype's zipinfo: {'modified': (2019, 7, 18, 14, 2, 18), 'system': 'unknown'}
Basic/script-lc.xml's zipinfo: {'modified': (2019, 7, 18, 14, 2, 18), 'system': 'unknown'}
Basic/Standard/script-lb.xml's zipinfo: {'modified': (2019, 7, 18, 14, 2, 18), 'system': 'unknown'}
Configurations2/accelerator/'s zipinfo: {'modified': (2019, 7, 18, 14, 2, 36), 'system': 'unknown'}
There is a lot of information, but it is displayed in a form that is hard to read. Let's try to make sense of it:
- The editing-duration: PT15M15S line shows the total editing time; in ISO 8601 duration notation this is 15 minutes 15 seconds.
- The generator line, LibreOffice/6.2.5.2$Linux_X86_64 LibreOffice_project/20$Build-2, reveals the program, its version and the operating system used to create the document.
- editing-cycles: 5 is apparently the number of revisions.
- The creation date: creation-date: 2019-07-18T13:45:13.664694066
- The print date: print-date: 2019-07-18T16:18:33.320774451
- And simply the date, which is the time of the last modification: date: 2019-07-18T17:02:01.639751003
- There is some information about macros: Basic/Standard/Test_Macro.xml and Scripts/javascript/Python_Macro/parcel-descriptor.xml
- And also many zipinfo entries with numbers, which are the modification times of the files inside the archive.
A lot of data was extracted, but nothing is said about the digital signature itself: only the META-INF/documentsignatures.xml file appears in the list.
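The zipinfo lines can also be reproduced without mat. Below is a minimal sketch, assuming only Python's standard zipfile module; the function names archive_timestamps and has_signature are made up for this illustration and have nothing to do with how mat works internally. It lists the modification time stored for every archive member and checks whether the signature file is present.

```python
import zipfile

# Path of the signature file inside the archive, as listed in the mat output.
SIGNATURE_MEMBER = "META-INF/documentsignatures.xml"


def archive_timestamps(path):
    """Return {member name: (year, month, day, hour, minute, second)}."""
    with zipfile.ZipFile(path) as odt:
        return {info.filename: info.date_time for info in odt.infolist()}


def has_signature(path):
    """Return True if the archive contains the digital signature file."""
    with zipfile.ZipFile(path) as odt:
        return SIGNATURE_MEMBER in odt.namelist()
```

For example, archive_timestamps("file2.odt") returns a dictionary keyed by member name, and has_signature("file2.odt") tells you whether the document carries a signature file at all.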
Let's look at the output of the mat2 command:
mat2 -s file2.odt
It shows roughly the same information.
Metadata from LibreOffice template files
The mat program can work with .ott files and displays meta information from Writer templates.
mat -d file2.ott
But mat2 failed on the same file:
mat2 -s file2.ott
[-] file2.ott's format (application/vnd.oasis.opendocument.text-template) is not supported
The trick of changing the file extension helped, though: renaming the file from .ott to .odt was enough for mat2 to show its metadata (the same trick was used in the article on Word files).
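If you would rather keep the original template untouched, the rename can be done on a copy. This is a small sketch using only the Python standard library; copy_as_odt is a hypothetical helper name, and the script does not call mat2 itself.

```python
import shutil
from pathlib import Path


def copy_as_odt(template_path):
    """Copy a .ott template next to itself with an .odt extension."""
    src = Path(template_path)
    dst = src.with_suffix(".odt")
    # Refuse to overwrite an existing document with the same base name.
    if dst.exists():
        raise FileExistsError(f"{dst} already exists")
    shutil.copyfile(src, dst)
    return dst
```

The returned path can then be passed to mat2 -s as usual.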
Image metadata in LibreOffice files
When you add pictures to an MS Word document, Word compresses them, and the images' own metadata is lost. LibreOffice Writer stores the original image! An important conclusion follows: if images inserted into a LibreOffice document contain, for example, GPS coordinates, camera information or other metadata, that data can be extracted!
I created a text document in Writer (.odt) and added an image to it.
The mat program found both the photo and information about it:
mat -d file.odt
The mat2 program showed data in a more readable form:
mat2 -s file.odt
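Since the original image is stored inside the document, it can also be pulled out and examined separately. The following is a rough sketch that assumes the images sit in the Pictures/ folder of the archive; extract_pictures is a name invented for this example.

```python
import zipfile
from pathlib import Path


def extract_pictures(odt_path, out_dir):
    """Save every file from the archive's Pictures/ folder into out_dir."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    saved = []
    with zipfile.ZipFile(odt_path) as odt:
        for info in odt.infolist():
            if info.filename.startswith("Pictures/") and not info.is_dir():
                # Keep only the base name so nothing is written outside out_dir.
                target = out / Path(info.filename).name
                target.write_bytes(odt.read(info))
                saved.append(target)
    return saved
```

The extracted files are byte-for-byte copies of what Writer stored, so any metadata viewer, including mat2 -s, can be run on them directly.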
LibreOffice file structure
A LibreOffice file is a ZIP archive: you can change the file extension to .zip and unpack it.
Inside there are mostly .xml files.
Images are in the Pictures/ folder.
The text of the document is stored in the content.xml file.
Digital signature data is stored in the META-INF/documentsignatures.xml file.
Metadata such as creation time, revision time, number of revisions, document statistics, the program that created the document, and some others are stored in the meta.xml file.
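The meta.xml file can be read straight from the archive without unpacking it. Here is a minimal sketch, assuming the standard OpenDocument layout in which the fields are children of an office:meta element; read_meta is a hypothetical helper, not part of mat or mat2.

```python
import zipfile
import xml.etree.ElementTree as ET

# Namespace of the office:meta element in OpenDocument files.
OFFICE_NS = "urn:oasis:names:tc:opendocument:xmlns:office:1.0"


def _local(name):
    """Strip the '{namespace}' prefix that ElementTree puts on names."""
    return name.split("}", 1)[-1]


def read_meta(path):
    """Return the fields of meta.xml as a dict keyed by element name."""
    with zipfile.ZipFile(path) as odt:
        root = ET.fromstring(odt.read("meta.xml"))
    meta = root.find(f"{{{OFFICE_NS}}}meta")
    fields = {}
    if meta is None:
        return fields
    for child in meta:
        text = (child.text or "").strip()
        # Elements such as document-statistic keep their data in attributes.
        # Repeated elements with the same name overwrite each other here.
        fields[_local(child.tag)] = text or {
            _local(key): value for key, value in child.attrib.items()
        }
    return fields
```

Calling read_meta("file2.odt") should give a dictionary with keys such as generator, creation-date and editing-cycles, matching the fields that mat printed.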
If desired, the data can be edited manually and then reassembled into a LibreOffice file, by analogy with MS Word files as shown in the article on Word metadata.
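Reassembly can be scripted too. The sketch below is one possible approach, not the method from the Word article: rebuild_odt is an invented name, and the replacements argument (archive member name mapped to new bytes) is an assumption of this example. It writes the mimetype entry first and uncompressed, which is what the OpenDocument packaging rules expect.

```python
import zipfile


def rebuild_odt(src_path, dst_path, replacements):
    """Copy an ODF archive, swapping in new bytes for the named members."""
    with zipfile.ZipFile(src_path) as src, zipfile.ZipFile(dst_path, "w") as dst:
        # Stable sort: "mimetype" moves to the front, the rest keep their order.
        infos = sorted(src.infolist(), key=lambda i: i.filename != "mimetype")
        for info in infos:
            data = replacements.get(info.filename)
            if data is None:
                data = src.read(info.filename)
            # A fresh ZipInfo that keeps the original name and timestamp.
            new_info = zipfile.ZipInfo(info.filename, date_time=info.date_time)
            new_info.external_attr = info.external_attr
            stored = info.filename == "mimetype" or info.is_dir()
            dst.writestr(
                new_info,
                data,
                compress_type=zipfile.ZIP_STORED if stored else zipfile.ZIP_DEFLATED,
            )
```

For example, rebuild_odt("file.odt", "edited.odt", {"meta.xml": new_meta_bytes}) writes a copy with an edited meta.xml. Note that this sketch deliberately keeps the original ZIP timestamps, and that changing the contents of a signed document will invalidate its digital signature.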
How to clear the LibreOffice file metadata
The tools we already know cope well with clearing metadata: just run them without options and specify the path to the file.
To clean metadata from a file using mat:
mat file.odt
To clean with mat2:
mat2 file.odt
Please note that mat2 does NOT modify the specified file. It leaves the original intact and creates a new file whose name has the string “cleaned” added just before the file extension.
It is also important to note that mat2 could not clean the file containing macros, whereas mat could.
Metadata is also removed from all embedded files, including images! That is, there is no need to clean a photo's metadata before adding it to a document if you plan to clean the entire document.
Conclusion
If preserving your anonymity is not a concern, then perhaps you shouldn’t worry about metadata: it can’t hurt you.
For all other cases, the mat and mat2 programs will help you. Despite the similar names, treat them as two separate programs: mat2 is a from-scratch rewrite rather than an updated build of mat, and, as shown above, the two differ in which files they can handle.