SENSAI

About openSMILE

openSMILE is audEERING’s much-cited open-source toolkit for real-time audio feature extraction and classification. Over the last decade, it has evolved into the most widely used tool for next-generation audio analysis applications such as emotion recognition from speech. Within the research community, openSMILE began to attract attention in 2009, when it was used as the baseline speech feature extractor for the Interspeech Emotion Challenge. Since then, the core technology of openSMILE has been continuously improved and extended. It has been applied in several EC-funded projects (e.g., SEMAINE, ASC-Inclusion) and in numerous other international research challenges, such as the Interspeech Paralinguistic Challenges 2010 to 2016 and the AVEC Challenges 2011 to 2016. openSMILE has been acknowledged twice in the context of the ACM Multimedia open-source award.

For many years, our engineers at audEERING have been enriching the tool with new features and machine learning algorithms that reflect the state of the art in affective computing. Based on this powerful software toolkit, we offer superb commercial services and applications. To learn more, visit our openSMILE page.


sensAI Modules

Our sensAI technology combines proprietary algorithms for speech, music, and general audio analysis with a well-established core module, our open-source audio analyzer openSMILE. sensAI comes in multiple flavours, described below.



sensAI-base is audEERING’s basic audio analysis engine, which identifies speech units, music, and voice gender. Our sensAI products can deal with real-life acoustic conditions. sensAI-base includes our robust, intelligent voice activity detection technology, which locates voice signals even in very noisy environments, for example in the presence of background music or street noise.

sensAI-enhance is a smart background noise remover for speech enhancement. It can be used to separate speech from noise and to improve the clarity of speech through intelligent source separation technology.


sensAI-music can be applied to several music processing and classification tasks, such as singing voice detection or beat, rhythm, mood, and genre analysis. It can extract chorus segments, identify key and chords, detect tempo, meter, and dance style, and robustly track beats in a song, including the identification of downbeats and percussive beats. Our detector for singing voice segments allows you to automatically label and extract vocal segments from any music recording.

sensAI-emotion includes audEERING’s world-leading technology for identifying emotions and affective states from the voice. Applied to the extracted voice parts, sensAI-emotion detects various kinds of speaker states and traits, either as basic emotion categories (joy, anger, fear, etc.) or as continuous values for “emotional dimensions” (valence, arousal, etc.). Automated paralinguistic speech analysis via sensAI-emotion can be used to improve efficiency in several business fields, for example quality control in call centres, targeted advertising, or increasing the usability of intelligent virtual agents and humanoid robots. sensAI-emotion combines world-leading research results with systematic, high-quality engineering, machine learning, and software development.


openSMILE (Speech and Music Interpretation by Large audio-Space Extraction) is the speech analysis toolkit that provides a technically solid and scientifically well-evaluated core for audEERING’s audio and speech analysis technology. The openSMILE feature extraction tool enables you to extract large audio feature spaces in real time and is available both as a standalone command-line executable and as a dynamic library. The main features of openSMILE are its capability for on-line incremental processing and its modularity. Feature extractor components can be freely interconnected to create new and custom features. openSMILE is a cross-platform tool (running on Windows, Linux, Mac, and Android) that offers fast and efficient real-time audio processing, reusability of components, plugin support, and multi-threading for parallel feature extraction.
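To give a rough idea of what on-line incremental processing with interconnectable components can look like, here is a minimal, purely illustrative Python sketch. It does not use openSMILE or reproduce its API; the names `frames`, `rms_energy`, `zero_crossing_rate`, and `extract` are hypothetical and chosen only for this example. Samples are consumed one at a time, and each feature component is applied as soon as a full frame is available.

```python
import math
from typing import Callable, Dict, Iterable, Iterator, List

Frame = List[float]


def frames(samples: Iterable[float], size: int, hop: int) -> Iterator[Frame]:
    """Yield overlapping frames as soon as enough samples have arrived."""
    buf: List[float] = []
    for sample in samples:
        buf.append(sample)
        if len(buf) == size:
            yield list(buf)
            del buf[:hop]  # keep the overlap, drop the oldest `hop` samples


def rms_energy(frame: Frame) -> float:
    """Root-mean-square energy of one frame."""
    return math.sqrt(sum(x * x for x in frame) / len(frame))


def zero_crossing_rate(frame: Frame) -> float:
    """Fraction of adjacent sample pairs whose sign differs."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a < 0) != (b < 0))
    return crossings / (len(frame) - 1)


def extract(
    samples: Iterable[float],
    size: int,
    hop: int,
    components: Dict[str, Callable[[Frame], float]],
) -> Iterator[Dict[str, float]]:
    """Apply every registered component to each frame, one frame at a time."""
    for frame in frames(samples, size, hop):
        yield {name: fn(frame) for name, fn in components.items()}


if __name__ == "__main__":
    # A toy alternating signal stands in for a live audio stream.
    stream = (1.0 if i % 2 == 0 else -1.0 for i in range(8))
    components = {"rms": rms_energy, "zcr": zero_crossing_rate}
    for features in extract(stream, size=4, hop=2, components=components):
        print(features)
```

Because `extract` is a generator, features are emitted while the input is still arriving, and new components can be added to the dictionary without touching the framing logic.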