How to find out the type of a file without an extension (in Windows and Linux)

If you came here from a search and just need to quickly find out the type of a file, the web page “Online determining the file type without extension” is at your service: https://w-e-b.site/?act=file-type. You do not need to install any programs: the online service determines the type of the submitted file within a second and shows the results of scanning it with the four programs discussed in this article. It also displays the meta information found in the file, which often contains something interesting.

If you are one of those who want to know how the tools of the specified service work, as well as how to use them on your computer, then continue reading.

If it seems to you that the problem of determining file types without extensions is far-fetched, it is not. First, you can easily run into a file without an extension, for example, when decoding a Base64 string, and identifying such a file is not a trivial task.

Second, this article will have a follow-up in which the same tools you learn about on this page will be used for:

1) parsing firmware (for example, of routers and IP cameras) into its component parts (the first stage of reverse engineering or of analyzing how devices work in order to search for vulnerabilities and backdoors)

2) searching for file systems on disks and in disk images (the first stage of a forensic examination)

3) searching for deleted files

How to determine the data type if the file does not have an extension

If a file does not have an extension, then the only way to determine its type is to examine the contents of the file. You can try adding different extensions to the file name and opening the file in the programs associated with each extension, but this approach is slow and unreliable.

Binary files of the same type usually share a characteristic sequence of bytes, and these bytes can be used to recognize the file type. This is the method used by programs designed to determine the type of data. The characteristic bytes are not always located at the very beginning of the file, so in addition to the bytes themselves you need to know the offset from the beginning of the file at which they should appear. Some programs, in addition to the identifying bytes, also keep a list of checks that weed out false positives.

Such patterns are often called “magic”; the term comes from the “magic number” in executable files. These files have a “magic number” stored at a specific location near the beginning of the file, which tells the UNIX operating system that the file is a binary executable and which of several types it is. The magic number concept has since been applied to other binary files. That is, files of the same type have the same sequence of bytes at a specific offset from the beginning of the file.

A file with signatures describing exactly which bytes, at what distance from the beginning of the file are typical for files of one type or another, is usually called a magic file.
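To make the idea of “bytes plus offset” more concrete, here is a minimal sketch in Python. The table of five signatures is hand-picked for illustration, and the names SIGNATURES, detect_type and detect_file are my own; real magic files contain thousands of much richer rules, so treat this as a toy model and not as the way the file utility is implemented.

```python
import base64
from typing import Optional

# (offset, signature bytes, description): a tiny illustrative table.
SIGNATURES = [
    (0, b"\x89PNG\r\n\x1a\n", "PNG image"),
    (0, b"%PDF-", "PDF document"),
    (0, b"PK\x03\x04", "ZIP archive (also DOCX, ODT, JAR, ...)"),
    (0, b"\x7fELF", "ELF executable"),
    (257, b"ustar", "POSIX tar archive"),  # signature is NOT at offset 0
]


def detect_type(data: bytes) -> Optional[str]:
    """Return the description of the first matching signature, or None."""
    for offset, magic, description in SIGNATURES:
        if data[offset:offset + len(magic)] == magic:
            return description
    return None


def detect_file(path: str) -> Optional[str]:
    """Read only as many leading bytes as the table needs, then match."""
    needed = max(offset + len(magic) for offset, magic, _ in SIGNATURES)
    with open(path, "rb") as handle:
        return detect_type(handle.read(needed))


if __name__ == "__main__":
    # Data decoded from Base64 has no file name at all, only content.
    decoded = base64.b64decode("iVBORw0KGgo=")
    print(detect_type(decoded))
    # A tar header carries its signature 257 bytes into the file.
    print(detect_type(b"\x00" * 257 + b"ustar\x00"))
    print(detect_type(b"hello"))
```

The tar entry shows why the offset matters: a check of the first bytes alone would never recognize this format.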

To appreciate the amount of work that goes into finding the unique bytes that are always present in certain files, look at the magic file for identifying file systems: https://github.com/file/file/blob/master/magic/Magdir/filesystems

This is just one file from a list of different file types: https://github.com/file/file/tree/master/magic/Magdir

In addition to magic numbers, other techniques can be used. For example, the file program first performs filesystem tests based on the result of the stat system call (this is how it recognizes empty files, directories, symbolic links and other special files). The type of a text file is determined by the strings it contains (for example, it can be PHP code, a file in XML or HTML markup, JSON, and so on).
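These two additional kinds of tests can be sketched roughly as follows. This is a simplified illustration under my own assumptions: the function names filesystem_test and text_test are hypothetical, and the handful of string checks is far cruder than the language tests in the real file program.

```python
import json
import os
import stat
from typing import Optional


def filesystem_test(path: str) -> Optional[str]:
    """Classify by stat() alone; None means 'look at the contents'."""
    info = os.lstat(path)  # lstat does not follow symbolic links
    if stat.S_ISDIR(info.st_mode):
        return "directory"
    if stat.S_ISLNK(info.st_mode):
        return "symbolic link"
    if stat.S_ISREG(info.st_mode) and info.st_size == 0:
        return "empty"
    return None


def text_test(data: bytes) -> Optional[str]:
    """Guess a text format from characteristic strings; None if not text."""
    try:
        text = data.decode("utf-8")
    except UnicodeDecodeError:
        return None
    if "\x00" in text:  # NUL bytes strongly suggest binary data
        return None
    stripped = text.strip()
    lowered = stripped[:100].lower()
    if stripped.startswith("<?php"):
        return "PHP script"
    if stripped.startswith("<?xml"):
        return "XML document"
    if lowered.startswith("<!doctype html") or lowered.startswith("<html"):
        return "HTML document"
    if stripped.startswith(("{", "[")):
        try:
            json.loads(stripped)
            return "JSON data"
        except ValueError:
            pass  # looked like JSON but did not parse
    return "plain text"


if __name__ == "__main__":
    print(filesystem_test("."))
    print(text_test(b'{"name": "file1", "size": 42}'))
    print(text_test(b"<?php echo 'hi'; ?>"))
```

Note the order: the cheap filesystem test runs first, and the contents are examined only when it returns nothing.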

The file command – instantly detects the type of any file

Linux has a file command with a huge signature base that detects the type of a file very quickly.

To find out what a file is without an extension, run a command like this:

file /PATH/TO/FILE

For example:

file file1

Output:

file1: Microsoft Word 2007+

That is, it is a Microsoft Word document in the format used since Microsoft Office 2007 (.docx).
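This example is interesting because a .docx file is actually a ZIP archive, so the first bytes alone only say “ZIP”; to tell a Word document from an ordinary archive, one has to look inside. Below is a minimal sketch of such a check. The function names are mine, and the rule “contains [Content_Types].xml and word/document.xml” is a simplifying assumption based on the conventional layout of such files, not a description of how file itself reaches its verdict.

```python
import io
import zipfile


def looks_like_docx(data: bytes) -> bool:
    """True if data is a ZIP archive holding the usual Word document parts."""
    if not data.startswith(b"PK\x03\x04"):  # ZIP local file header
        return False
    try:
        with zipfile.ZipFile(io.BytesIO(data)) as archive:
            names = set(archive.namelist())
    except zipfile.BadZipFile:
        return False
    return "[Content_Types].xml" in names and "word/document.xml" in names


def make_zip(members: dict) -> bytes:
    """Helper for the demo: build a ZIP archive in memory."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        for name, content in members.items():
            archive.writestr(name, content)
    return buffer.getvalue()


if __name__ == "__main__":
    fake_docx = make_zip({"[Content_Types].xml": "<Types/>",
                          "word/document.xml": "<document/>"})
    plain_zip = make_zip({"readme.txt": "hello"})
    print(looks_like_docx(fake_docx))
    print(looks_like_docx(plain_zip))
```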

You can specify several files at once for verification or use wildcards. For example, the following command will check the types of all files in the current folder:

file *

The file program has options, see the separate article “Instructions for using the file command” for details.
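If you need the result of file inside a script, the command can be wrapped in a few lines of Python. This is only a sketch: the function name identify and its parameters are my own, and it assumes that a file executable supporting the --brief and --mime-type options is available in PATH (when it is not, the function simply returns None).

```python
import shutil
import subprocess
from typing import Optional


def identify(path: str, file_cmd: str = "file", mime: bool = False) -> Optional[str]:
    """Run file on path; return its verdict, or None if file is unavailable."""
    executable = shutil.which(file_cmd)
    if executable is None:
        return None
    args = [executable, "--brief"]  # --brief: do not prepend the file name
    if mime:
        args.append("--mime-type")  # e.g. text/plain instead of a description
    args += ["--", path]  # "--" protects paths that start with a dash
    result = subprocess.run(args, capture_output=True, text=True, check=False)
    if result.returncode != 0:
        return None
    return result.stdout.strip()


if __name__ == "__main__":
    print(identify(__file__))
    print(identify(__file__, mime=True))
```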

Analogue of the file command for Windows

file is a command line utility for Linux, so Windows users need some sort of alternative. Let's look at several ways to use file on Windows.

1. The file utility in Cygwin

This method, in my opinion, is the simplest. Just download Cygwin and you can use most Linux utilities. For details, including how to specify paths in the file system, see the article “How to get started with Linux commands on Windows: Cygwin”.

2. file in WSL

Windows Subsystem for Linux (WSL) is another way to use Linux utilities on Windows. For details on working with WSL, see the reference material “WSL (Windows Subsystem for Linux): Hints, How-Tos, Troubleshooting”.

3. Compiled “file” program for Windows

On the page https://github.com/julian-r/file-windows/releases you can download the compiled files of the file utility (another source is https://github.com/nscaife/file-windows/releases, but there is an older version).

The files differ in architecture (64- and 32-bit) and also in the compiler.

Download a file, for example file_5.38-build49-vs2019-x64.zip.

Unpack the downloaded archive. For example, I put the downloaded files in the C:\Users\MiAl\Downloads\file\ folder.

Open a command prompt: press Win+x and select “Windows PowerShell”.

Go to the folder with the program:

cd C:\Users\MiAl\Downloads\file\

To determine the file type, use a command of the form:

.\file 'PATH:\TO\FILE'

For example:

.\file 'Z:\testfiles\file1'

You can check many files at once. To do this, go to the folder with the file utility and run a command like this (the FullName property passes the full path of each file to the utility):

dir 'PATH:\TO\FOLDER\*' | foreach { .\file $_.FullName }

For example, I want to check all files in the Z:\testfiles\ folder, then the command is as follows:

dir 'Z:\testfiles\*' | foreach { .\file $_.FullName }

4. TrID is a cross-platform file alternative for Windows and Linux

There are quite a few signatures in the TrID utility and the database is constantly updated with new signatures. The program is cross-platform, detailed instructions for installation and use in Windows and Linux can be found on this page: https://en.kali.tools/?p=1652

An example of analyzing a file – note that several options are displayed with an indication of the percentage probability of each of them:

export LC_ALL=C
trid /mnt/disk_d/Share/testfiles/file1

If you specify several files for identification, then only the most likely variant of the file type will be displayed:

trid /mnt/disk_d/Share/testfiles/*

Installing TrID on Windows

Go to the official website, download the archive with the executable file (mark0.net/download/trid_w32.zip) for Windows, as well as the archive with the signature database (mark0.net/download/triddefs.zip).

Unpack both archives into one folder. For example, I put the unpacked files in the C:\Users\MiAl\Downloads\trid\ folder.

Open a command prompt: press Win+x and select “Windows PowerShell”.

Go to the folder with the program:

cd C:\Users\MiAl\Downloads\trid\

To determine the file type, use a command of the form:

.\trid 'PATH\TO\FILE'

For example:

.\trid Z:\testfiles\file1

Wildcards can be used to scan groups of files, entire folders, and so on.

In addition, the -ae switch instructs TrID to add the guessed extensions to the file names. This comes in handy, for example, when working with files recovered by data recovery software. For example:

trid c:\temp\* -ae

 TrID/32 - File Identifier v2.24 - (C) 2003-16 By M.Pontello          
 Definitions found:  5702
 Analyzing...

 File: c:\temp\FILE0001.CHK
  75.8% (.BAV) The Bat! Antivirus plugin (187530/5/21)

 File: c:\temp\FILE0002.CHK
  77.8% (.OGG) OGG Vorbis Audio (14014/3)

 File: c:\temp\FILE0003.CHK
  86.0% (.DOC) Microsoft Word document (49500/1/4)

 File: c:\temp\FILE0004.CHK
  42.6% (.EXE) UPX compressed Win32 Executable (30569/9/7)

  4 file(s) renamed.

At this point, the files in the c:\temp folder will look like:

  • FILE0001.CHK.bav
  • FILE0002.CHK.ogg
  • FILE0003.CHK.doc
  • FILE0004.CHK.exe

By contrast, the -ce switch replaces the existing file extension with the new one; if the file has no extension, the new one is simply added. For example:

  • IAmASoundFile.dat -> IAmASoundFile.wav
  • IAmABitmap -> IAmABitmap.bmp
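The difference between the two renaming modes is easy to express in code. The sketch below only models the naming rule as it appears in the examples above; it is not TrID's own code, the function names are mine, and the lowercasing of the extension is an assumption taken from the sample file listing.

```python
from pathlib import PureWindowsPath


def add_extension(name: str, ext: str) -> str:
    """Model of -ae: keep the whole current name, append the guessed extension."""
    return name + "." + ext.lower()


def change_extension(name: str, ext: str) -> str:
    """Model of -ce: replace the current extension, or add one if there is none."""
    return str(PureWindowsPath(name).with_suffix("." + ext.lower()))


if __name__ == "__main__":
    print(add_extension("FILE0002.CHK", "OGG"))
    print(change_extension("IAmASoundFile.dat", "WAV"))
    print(change_extension("IAmABitmap", "BMP"))
```

In other words, -ae never discards information from the original name, while -ce produces cleaner names at the cost of losing the old extension.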

TrID can get a file list from stdin, with the -@ switch.

So it's possible to work on an entire folder tree, or a particular subset of files, just using the output of some other command through a pipe. Something like:

dir d:\recovered_drive /s /b | trid -ce -@
 Definitions found:  5702
 Analyzing...

 File: d:\recovered_drive\notes
 100.0% (.RTF) Rich Text Format (5000/1)

 File: d:\recovered_drive\temp\FILE0001.CHK                           
  77.8% (.OGG) OGG Vorbis Audio (14014/3)

 ...  

It's possible to tell TrID to show some more information about every match (such as the mime type, who created that definition, how many files were scanned, etc.); and it's also possible to limit the number of results shown.

The -v switch activates verbose mode, and -r:nn specifies the maximum number of matches that TrID will display. The default is 5 for normal mode, 2 for verbose mode, and 1 for multi-file analysis.

trid "c:\t\Windows XP Startup.ogg" -v -r:2

 TrID/32 - File Identifier v2.24 - (C) 2003-16 By M.Pontello          

 Collecting data from file: c:\t\Windows XP Startup.ogg
 Definitions found: 5702
 Analyzing...

  77.8% (.OGG) OGG Vorbis audio (14014/3)
          Mime type  : audio/ogg
        Definition   : audio-ogg-vorbis.trid.xml
          Files      : 37
        Author       : Marco Pontello
          E-Mail     : marcopon@nospam@gmail.com
          Home Page  : http://mark0.net

  22.2% (.OGG) OGG stream (generic) (4000/1)
        Definition   : ogg-stream.trid.xml
          Files      : 35
        Author       : Marco Pontello
          E-Mail     : marcopon@nospam@gmail.com
          Home Page  : http://mark0.net
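A note on the percentages. In the output above, the first number in parentheses looks like a score for the match (14014 and 4000), and the percentages are consistent with each score divided by the sum of all scores. This is my inference from the numbers shown, not a documented statement about TrID's internals, so take the following sketch (with a function name of my own) as an illustration of that assumption only.

```python
def match_percentages(points: dict) -> dict:
    """Normalize per-match scores so that together they add up to 100%."""
    total = sum(points.values())
    if total == 0:
        return {name: 0.0 for name in points}
    return {name: round(100.0 * value / total, 1) for name, value in points.items()}


if __name__ == "__main__":
    scores = {"OGG Vorbis audio": 14014, "OGG stream (generic)": 4000}
    for name, percent in match_percentages(scores).items():
        print(f"{percent:5.1f}% {name}")
```

With the two scores from the example this gives 77.8% and 22.2%, the same values that TrID printed.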

TrID is not updated frequently, but the database is regularly updated with new signatures, so update the database from time to time.

5. fil is another cross-platform alternative to file

The fil utility is written in Go and is cross-platform. But there are so few signatures in the program that, in my opinion, the fil utility is practically useless.

Alternatives to file

In most cases the file utility is sufficient for determining the type of a file without an extension, but there are utilities with related functionality that can replace the file command or refine its results. Each of these programs will be discussed in more detail in the next part; for now, here is only a brief overview.

Detect It Easy

Detect It Easy is a cross-platform file type detection tool. There is a variant with a graphical interface as well as a command line interface.

You can find instructions for installing the program on its page https://en.kali.tools/?p=1644.

To analyze the /mnt/disk_d/Share/testfiles/file1 file and show the results in the graphical interface:

die /mnt/disk_d/Share/testfiles/file1

To analyze the /mnt/disk_d/Share/testfiles/file1 file and show the results in the command line interface:

diec /mnt/disk_d/Share/testfiles/file1

Detect It Easy is primarily aimed at analyzing executable files, so its functions are more related to program files, for example, determining the architecture. But there is also support for other binaries.
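To give an idea of what “determining the architecture” involves, here is a small sketch that reads the relevant fields of an ELF header (the executable format used on Linux). It has nothing to do with the internals of Detect It Easy; the function name and the short table of machine codes are mine and cover only a few common values.

```python
import struct

# A few common e_machine values; unknown ones are reported in hex.
ELF_MACHINES = {0x03: "x86", 0x28: "ARM", 0x3E: "x86-64", 0xB7: "AArch64"}


def elf_architecture(header: bytes):
    """Return (bits, byte order, machine) for an ELF header, or None."""
    if len(header) < 20 or header[:4] != b"\x7fELF":
        return None
    bits = {1: 32, 2: 64}.get(header[4])             # EI_CLASS
    order = {1: "little", 2: "big"}.get(header[5])   # EI_DATA
    if bits is None or order is None:
        return None
    fmt = "<H" if order == "little" else ">H"
    (machine,) = struct.unpack_from(fmt, header, 18)  # e_machine field
    return bits, order, ELF_MACHINES.get(machine, hex(machine))


if __name__ == "__main__":
    # Hand-made 20-byte header: 64-bit, little endian, x86-64.
    sample = (b"\x7fELF" + bytes([2, 1, 1, 0]) + b"\x00" * 8
              + struct.pack("<HH", 2, 0x3E))
    print(elf_architecture(sample))
```

To try it on a real program, pass the first 20 bytes of any ELF binary, for example open("/bin/ls", "rb").read(20) on a Linux system.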

Binwalk

Binwalk is a firmware analysis program, but it contains a lot of binary file signatures, so it is also suitable for determining the file type. The peculiarity of Binwalk is that it is aimed at working with compound files (which firmware images usually are), so it can recognize a file type even when the embedded file does not start at the beginning of the containing file.

Usage is the same as file, it is enough to specify the path to one or more files:

binwalk /mnt/disk_d/Share/testfiles/file1
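The main difference from file, searching for signatures at every offset instead of only at fixed positions, can be illustrated with a short sketch. It is a toy model with three signatures and names of my own choosing; the real Binwalk does much more, including validating each hit to reduce false positives.

```python
# Signature bytes mapped to a description; a tiny illustrative set.
EMBEDDED_SIGNATURES = {
    b"\x1f\x8b\x08": "gzip compressed data",
    b"hsqs": "SquashFS filesystem (little endian)",
    b"PK\x03\x04": "ZIP archive",
}


def scan(data: bytes):
    """Return a sorted list of (offset, description) for every signature hit."""
    hits = []
    for magic, description in EMBEDDED_SIGNATURES.items():
        position = data.find(magic)
        while position != -1:
            hits.append((position, description))
            position = data.find(magic, position + 1)
    return sorted(hits)


if __name__ == "__main__":
    # Fake "firmware": padding, a gzip header, more padding, a SquashFS magic.
    firmware = (b"\xff" * 64 + b"\x1f\x8b\x08" + b"\x00" * 29
                + b"hsqs" + b"\x00" * 16)
    for offset, description in scan(firmware):
        print(f"{offset:<8} 0x{offset:<8X} {description}")
```

Short signatures like these will occasionally match random data, which is exactly why such tools need additional checks against false positives.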

Detect It Easy and Binwalk are not so much competitors of the file utility as they are the “last chance” to determine the data type if the file command did not help.

See the continuation in the article “How to analyze and split compound files (firmware, multi partition disk images)”.
