Unrecognized mp3 file #75

Closed
parro-it opened this issue Jan 27, 2017 · 7 comments · Fixed by #101

Comments

@parro-it

I have some MP3 files that file-type does not recognize:

❯ file /usr/local/Wowza/content/20151027_Consiglio_01.mp3
/usr/local/Wowza/content/20151027_Consiglio_01.mp3: MPEG ADTS, layer III,  v2.5,  64 kbps, 11.025 kHz, Monaural

The first bytes of the file are 0xffe380, while file-type checks only for 0x494433 (the ASCII string "ID3") or else 0xfffb.
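
To make the mismatch concrete, here is a minimal sketch of a prefix-only check like the one described above. The names `startsWith` and `looksLikeMp3ByPrefix` are hypothetical and this is not file-type's actual code; it only encodes the two signatures mentioned in this comment.

```typescript
// Hypothetical illustration, not file-type's actual implementation.
function startsWith(buf: Uint8Array, sig: number[]): boolean {
  // A buffer shorter than the signature yields undefined, which never matches.
  return sig.every((byte, i) => buf[i] === byte);
}

function looksLikeMp3ByPrefix(buf: Uint8Array): boolean {
  return (
    startsWith(buf, [0x49, 0x44, 0x33]) || // "ID3" tag
    startsWith(buf, [0xff, 0xfb])          // one specific frame header prefix
  );
}

// First three bytes of the file from this report.
const sample = new Uint8Array([0xff, 0xe3, 0x80]);
console.log(looksLikeMp3ByPrefix(sample)); // false
```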

https://en.wikipedia.org/wiki/List_of_file_signatures lists only the two checks that file-type currently performs, but the magic file used by the POSIX file command seems to contain a more complex check:

# MP3, M1A
0       beshort&0xFFFE  0xFFFA         MPEG ADTS, layer III, v1
# rates
>2      byte&0xF0       0x10           \b,  32 kBits
>2      byte&0xF0       0x20           \b,  40 kBits
>2      byte&0xF0       0x30           \b,  48 kBits
>2      byte&0xF0       0x40           \b,  56 kBits
>2      byte&0xF0       0x50           \b,  64 kBits
>2      byte&0xF0       0x60           \b,  80 kBits
>2      byte&0xF0       0x70           \b,  96 kBits
>2      byte&0xF0       0x80           \b, 112 kBits
>2      byte&0xF0       0x90           \b, 128 kBits
>2      byte&0xF0       0xA0           \b, 160 kBits
>2      byte&0xF0       0xB0           \b, 192 kBits
>2      byte&0xF0       0xC0           \b, 224 kBits
>2      byte&0xF0       0xD0           \b, 256 kBits
>2      byte&0xF0       0xE0           \b, 320 kBits
# timing
>2      byte&0x0C       0x00           \b, 44.1 kHz
>2      byte&0x0C       0x04           \b, 48 kHz
>2      byte&0x0C       0x08           \b, 32 kHz
# channels/options
>3      byte&0xC0       0x00           \b, Stereo
>3      byte&0xC0       0x40           \b, JntStereo
>3      byte&0xC0       0x80           \b, 2x Monaural
>3      byte&0xC0       0xC0           \b, Monaural
#>1     byte            ^0x01          \b, Data Verify
#>2     byte            &0x02          \b, Packet Pad
#>2     byte            &0x01          \b, Custom Flag
#>3     byte            &0x08          \b, Copyrighted
#>3     byte            &0x04          \b, Original Source
#>3     byte&0x03       1              \b, NR: 50/15 ms
#>3     byte&0x03       3              \b, NR: CCIT J.17
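
The active (uncommented) lines of the rule above can be read as a small header decoder. The following is a rough sketch under that reading; `describeMpeg1Layer3` and the table names are made up for illustration, and only the v1 Layer III rule quoted here is covered.

```typescript
// Sketch of the quoted v1 Layer III rule; all names are illustrative only.
// Index = high nibble of byte 2. The rule lists 0x10..0xE0, so index 0 and
// index 15 have no entry and are skipped below.
const BITRATES_KBITS = [0, 32, 40, 48, 56, 64, 80, 96, 112, 128, 160, 192, 224, 256, 320];
// Index = (byte 2 & 0x0C) >> 2. The rule lists no entry for 0x0C.
const SAMPLE_RATES_KHZ = [44.1, 48, 32];
// Index = top two bits of byte 3.
const CHANNEL_MODES = ["Stereo", "JntStereo", "2x Monaural", "Monaural"];

function describeMpeg1Layer3(buf: Uint8Array): string | null {
  if (buf.length < 4) return null;
  // "0 beshort&0xFFFE 0xFFFA": big-endian 16-bit value at offset 0, low bit ignored.
  const beshort = (buf[0] << 8) | buf[1];
  if ((beshort & 0xfffe) !== 0xfffa) return null;

  const parts = ["MPEG ADTS, layer III, v1"];
  const bitrate = BITRATES_KBITS[buf[2] >> 4];
  if (bitrate) parts.push(`${bitrate} kBits`);
  const rate = SAMPLE_RATES_KHZ[(buf[2] & 0x0c) >> 2];
  if (rate !== undefined) parts.push(`${rate} kHz`);
  parts.push(CHANNEL_MODES[buf[3] >> 6]);
  return parts.join(", ");
}

console.log(describeMpeg1Layer3(new Uint8Array([0xff, 0xfb, 0x90, 0x44])));
// MPEG ADTS, layer III, v1, 128 kBits, 44.1 kHz, JntStereo
```

A buffer starting with 0xffe380 returns `null` here, because this particular rule only covers the v1 header.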
@mifi
Contributor

mifi commented Jan 27, 2017

Actually, libmagic has a lot of checks for identifying different MP3 variants:
https://github.com/threatstack/libmagic/blob/master/magic/Magdir/animation

Your file would match this rule:

# MP3, M25A
0       beshort&0xFFFE  0xFFE2         MPEG ADTS, layer III,  v2.5
!:mime	audio/mpeg
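
As a quick sanity check of that rule against the reported bytes, here is a minimal sketch. The function name `matchesMpeg25Layer3` is hypothetical; the mask and value come straight from the rule above.

```typescript
// Sketch of "0 beshort&0xFFFE 0xFFE2"; the function name is illustrative only.
function matchesMpeg25Layer3(buf: Uint8Array): boolean {
  if (buf.length < 2) return false;
  const beshort = (buf[0] << 8) | buf[1]; // big-endian 16-bit value at offset 0
  // The mask drops the lowest bit (the protection/CRC flag), so both
  // 0xFFE2 and 0xFFE3 match.
  return (beshort & 0xfffe) === 0xffe2;
}

console.log(matchesMpeg25Layer3(new Uint8Array([0xff, 0xe3, 0x80]))); // true
```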

We could implement the logic from libmagic. However, I don't know how we should map to a file extension, because MP1, MP2 and MP3 all just resolve to audio/mpeg.
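
One possible way to approach the extension question, sketched under assumptions: the MPEG audio frame header carries a two-bit layer field in the second byte, so the extension could be derived from it while the MIME type stays the same. The function name `guessMpegAudio` and the `mp1`/`mp2`/`mp3` extension choices are illustrative, not something decided in this thread.

```typescript
// Hypothetical sketch: map the layer bits of a frame header to an extension.
type MpegAudioGuess = { ext: "mp1" | "mp2" | "mp3"; mime: "audio/mpeg" };

function guessMpegAudio(buf: Uint8Array): MpegAudioGuess | null {
  if (buf.length < 2) return null;
  // 11-bit frame sync: all of byte 0 plus the top three bits of byte 1.
  if (buf[0] !== 0xff || (buf[1] & 0xe0) !== 0xe0) return null;
  // Layer field: bits 2..1 of byte 1.
  const layerBits = (buf[1] >> 1) & 0x03;
  switch (layerBits) {
    case 0b11: return { ext: "mp1", mime: "audio/mpeg" }; // Layer I
    case 0b10: return { ext: "mp2", mime: "audio/mpeg" }; // Layer II
    case 0b01: return { ext: "mp3", mime: "audio/mpeg" }; // Layer III
    default: return null;                                  // 0b00 is reserved
  }
}

console.log(guessMpegAudio(new Uint8Array([0xff, 0xe3, 0x80])));
```

A two-byte check like this is prone to false positives on arbitrary binary data, so a real implementation would likely want to validate more of the header.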

@parro-it
Author

> We could implement the logic from libmagic.

I saw your call for a libmagic reimplementation in JS, and I would be glad to help with the effort. I can think of three possible ways to start:

  1. try to compile libmagic to js using emscripten

  2. do a native binding module

  3. reimplement all the stuff

1) or 2) would probably be the better choice, but 3) is more fun for sure 😄

> However I don't know how we should map to file extension, because MP1, MP2 and MP3 all just resolve to audio/mpeg

I have no idea. For my use case I only need the MIME type. The comment in the magic header seems to mention only MP3?

@mifi
Contributor

mifi commented Jan 27, 2017

I was actually thinking about just implementing the mp3 stuff from libmagic, not the whole thing.

We opened an issue about implementing everything (#68). I actually spent a couple of hours trying to implement the instruction set from libmagic. I got some of the instructions right, but I realized that the instruction set is quite big and some of it is not trivial to implement, so I gave up. It also got quite slow.

But the logic required for parsing audio/mpeg is quite trivial to implement. PRs are welcome :)

@sindresorhus
Owner

Also, libmagic covers a lot of obscure and ancient stuff we no longer care about. Compiling it with emscripten would be very bloated. I think a better approach would be to find a way to precompile a selection of the rules into some kind of generated JS.

@parro-it
Author

> I think a better approach would be to find a way to precompile a selection of the rules into some kind of generated JS.

The format of the rules is well explained in the man page, so this should be possible. Only, as @mifi said, there is a lot of stuff to implement... But it could be done incrementally at least.
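
To give a feel for what an incremental start might look like, here is a very small sketch that turns top-level rules of the form `offset type[&mask] value description` into generated JS source. Everything in it (`parseRule`, `generateCheck`, `generateDetector`, the `Rule` shape) is hypothetical. It supports only the `byte` and `beshort` types with numeric values, and it skips comments, `!:mime` lines and `>` continuation lines entirely.

```typescript
// Hypothetical sketch of precompiling a tiny subset of magic rules to JS source.
type Rule = {
  offset: number;
  type: "byte" | "beshort";
  mask: number | null;
  value: number;
  description: string;
};

const RULE_RE =
  /^(\d+)\s+(byte|beshort)(?:&(0x[0-9a-fA-F]+|\d+))?\s+(0x[0-9a-fA-F]+|\d+)\s+(.*)$/;

function parseRule(line: string): Rule | null {
  const m = RULE_RE.exec(line.trim());
  if (!m) return null; // comments, "!:mime", ">" continuations, unsupported types
  return {
    offset: Number(m[1]),
    type: m[2] as "byte" | "beshort",
    mask: m[3] === undefined ? null : Number(m[3]), // Number() accepts "0x..." strings
    value: Number(m[4]),
    description: m[5].trim(),
  };
}

function generateCheck(rule: Rule): string {
  const read =
    rule.type === "byte"
      ? `buf[${rule.offset}]`
      : `((buf[${rule.offset}] << 8) | buf[${rule.offset + 1}])`;
  const masked = rule.mask === null ? read : `(${read} & ${rule.mask})`;
  return `if (${masked} === ${rule.value}) return ${JSON.stringify(rule.description)};`;
}

function generateDetector(lines: string[]): string {
  const checks = lines
    .map(parseRule)
    .filter((r): r is Rule => r !== null)
    .map(generateCheck);
  return ["function detect(buf) {", ...checks.map((c) => "  " + c), "  return null;", "}"].join("\n");
}

const source = generateDetector([
  "# MP3, M1A",
  "0       beshort&0xFFFE  0xFFFA         MPEG ADTS, layer III, v1",
  ">2      byte&0xF0       0x10           \\b,  32 kBits",
  "0       beshort&0xFFFE  0xFFE2         MPEG ADTS, layer III,  v2.5",
]);
console.log(source);
```

The generated `detect` function does no bounds checking, so reads past the end of the buffer evaluate to 0 after the bitwise operations; a real generator would need to handle that, along with the continuation lines that carry most of the detail in the rules.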

@s-ol

s-ol commented Aug 7, 2017

I also have a mono mp3 file with 0xFFFA magic. Here's a PR for it in another project: h2non/filetype#11

@karlhiramoto
Contributor

PR #101 should fix this case, as well as other cases I encountered.
