Frequently Asked Questions

Yes. It can perform real-time over-the-air (OTA) recognitions, even in noisy environments.

Covers are treated like any other original recording, so a cover can be recognized if it has itself been fingerprinted. In general, arbitrary covers cannot be recognized from a single fingerprinted recording, although this may work when a cover is perceptually very similar to the original.

It can recognize specific instances of speech, for example a specific radio program or TV show. It cannot categorize sounds as "speech". Audioneex is not a sound classifier.

Audioneex is based on audio fingerprinting technology, meaning it can recognize specific content, not generic audio categories. It does not perform classification and/or segmentation of sounds.

It can recognize generic sounds provided that there is very little variation each time they occur. For example, if you are designing a wearable device for deaf people that must recognize alarm sounds, then that is fine, since these sounds are always the same. But if you want to recognize broad classes of sounds such as "dog barks" or "door slams", then you are probably after a sound classifier, which is a different technology from audio fingerprinting.

The engine provides an estimate of the point in time where the recognition occurs within a recording, so it can be used to trigger external routines based on these time points.
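As an illustration of how such time estimates could drive external routines, here is a minimal sketch. It does not use the Audioneex API: the `CueDispatcher` class, its `on_position` method, and the cue list are hypothetical names, and the sketch assumes your integration code already receives the estimated position (in seconds) within the matched recording.

```python
class CueDispatcher:
    """Fires each cue action once when the estimated position passes it.

    Hypothetical helper, not part of Audioneex: it only assumes that the
    caller can supply the estimated position within the recording.
    """

    def __init__(self, cues):
        # cues: iterable of (time_in_seconds, action) pairs
        self._cues = sorted(cues, key=lambda cue: cue[0])
        self._next = 0  # index of the first cue that has not fired yet

    def on_position(self, position_seconds):
        """Run every pending cue at or before position_seconds; return their times."""
        fired = []
        while (self._next < len(self._cues)
               and self._cues[self._next][0] <= position_seconds):
            cue_time, action = self._cues[self._next]
            action(cue_time)
            fired.append(cue_time)
            self._next += 1
        return fired


# Usage: two cues, registered out of order on purpose.
log = []
dispatcher = CueDispatcher([
    (30.0, lambda t: log.append(("subtitle", t))),
    (12.5, lambda t: log.append(("ad", t))),
])
dispatcher.on_position(14.2)  # passes the 12.5 s cue
dispatcher.on_position(14.9)  # nothing new to fire
dispatcher.on_position(31.0)  # passes the 30.0 s cue
```

Note that in this sketch, if the first estimate already lies past several cues, all of them fire at once. If recognition can start mid-recording and stale cues should be skipped, the dispatcher would need an initial seek step instead.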

No, audio content can be identified from short snippets of just a few seconds. However, the whole recording must be fingerprinted and stored in order for it to be fully recognized.

This depends primarily on how much the audio has been distorted compared to the fingerprinted recording, and also on the nature of the audio. Typically, for audio that is only moderately distorted, recognition should happen within 5 seconds. For clean, undistorted audio it can be as fast as 1-2 seconds. It takes longer if the audio is substantially distorted, such as in OTA applications in noisy environments. Also, well-structured audio such as speech or music can be recognized faster than noise-like sounds.
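Because recognition time varies with distortion, an application will usually want to bound how long it keeps listening. The following is a minimal sketch of that idea only. The `identify_with_budget` function and the `recognize` callable are hypothetical stand-ins for whatever wraps the engine in your application, and the 1-second chunk size and 10-second budget are arbitrary assumptions, not recommended values.

```python
def identify_with_budget(chunks, recognize, chunk_seconds=1.0, budget_seconds=10.0):
    """Feed audio chunks to `recognize` until it matches or the budget is spent.

    `recognize` is a hypothetical callable (not an Audioneex function) that
    takes one chunk and returns a match label, or None if undecided.
    Returns (match_or_None, seconds_of_audio_consumed).
    """
    consumed = 0.0
    for chunk in chunks:
        if consumed + chunk_seconds > budget_seconds:
            break  # the next chunk would exceed the listening budget
        consumed += chunk_seconds
        match = recognize(chunk)
        if match is not None:
            return match, consumed
    return None, consumed


# Usage with a fake recognizer that "matches" on its third chunk.
calls = {"n": 0}

def fake_recognize(chunk):
    calls["n"] += 1
    return "track-A" if calls["n"] >= 3 else None

result = identify_with_budget(range(20), fake_recognize)
no_match = identify_with_budget(range(20), lambda chunk: None, budget_seconds=5.0)
```

A noisy OTA deployment would plausibly use a larger budget than a clean line-in capture.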

It has been extensively tested on the following platforms and architectures:

  • Linux (x86/x64)
  • Windows (x86/x64)
  • Android/Linux Embedded (ARMv7-A, ARMv8-A, x86/x86_64, MIPS 32/64)
  • iOS (ARMv7, ARMv8)

However, it may also run on other platforms. We have not tested every combination available on the market.

No, we do not provide any kind of audio database, nor is there a public web API.

Audioneex is open source, so you should first try porting it yourself. If you are not successful, we can help with the port, provided that 1) X does not lack any hardware or software functionality that the engine needs in order to work; and 2) we have access to the hardware and all required development tools for X. If we do not have access to these resources, you must provide them. The work will then be considered an integration service and will involve development fees.

NEWS: The full implementation of our commercial ACR engine is now open source and available on GitHub.


© 2014 Audioneex.com. All rights reserved.