Homophonic Collisions: Hold me closer Tony Danza
In this talk, we'll demonstrate practical approaches to exploiting the human misunderstandings caused by homophones (and similar-sounding words) to passively collect sensitive information. We will also share redacted real-world examples of sensitive information collected via a catch-all email address on a soundsquatted domain. Domains registered for soundsquatting are likely to be missed by typosquatting and lookalike detection tools such as DNSTwist.
We will also cover how language package ecosystems (such as PyPI) and social media handles are vulnerable to the same technique and how they can be abused. We will share an inventory of assets that are vulnerable to homophonic collisions, along with potentially compromised or malicious examples that already exist in the wild.
Finally, we intend to release tooling and/or a website for finding vulnerable use cases, along with defensive and detection capabilities to help combat this attack vector.
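As a rough illustration of the kind of check such tooling could perform, the sketch below flags words that sound alike but are more than one character edit apart, which is the sort of pair a purely edit-based typo generator may not produce. It uses classic Soundex, a deliberately coarse phonetic algorithm chosen here only for brevity; it is not necessarily the approach the talk's tooling takes. The function names (`soundex`, `edit_distance`, `soundalike_candidates`), the `min_edits` threshold, and the sample word list are all hypothetical.

```python
import string

# Standard American Soundex consonant groups.
_CODES = {}
for _digit, _letters in (("1", "bfpv"), ("2", "cgjkqsxz"), ("3", "dt"),
                         ("4", "l"), ("5", "mn"), ("6", "r")):
    for _ch in _letters:
        _CODES[_ch] = _digit


def soundex(word):
    """Return the 4-character Soundex key for an ASCII word ('' if no letters)."""
    letters = [c for c in word.lower() if c in string.ascii_lowercase]
    if not letters:
        return ""
    key = letters[0].upper()
    prev = _CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "hw":
            continue  # h and w do not separate repeated codes
        code = _CODES.get(ch, "")
        if code and code != prev:
            key += code
        prev = code  # vowels reset prev, so a repeated code is kept after them
    return (key + "000")[:4]


def edit_distance(a, b):
    """Levenshtein distance computed one row at a time."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]


def soundalike_candidates(target, wordlist, min_edits=2):
    """Words sharing target's Soundex key that are at least min_edits edits away.

    min_edits is an assumed threshold standing in for "not reachable by a
    single-character typo"; real lookalike tools differ in what they generate.
    """
    key = soundex(target)
    return [w for w in wordlist
            if w != target
            and soundex(w) == key
            and edit_distance(target, w) >= min_edits]


if __name__ == "__main__":
    # Hypothetical word list for demonstration only.
    print(soundalike_candidates("flour", ["flower", "floor", "flout", "bread"]))
```

Soundex produces many false positives and misses homophones that start with different letters (for example "knight" and "night"), so a practical near-homophone model would need a richer phonetic representation than this sketch uses.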
- What are homophones and near-homophones
- Algorithms and modeling for determining near-homophones
- Finding vulnerable domains, packages, and social media handles
- Detecting opportunities for exploitation
- Detecting examples of existing exploitation
- Pseudo-exploitation for fun and profit science … just fun (examples of real-world sensitive data collected from homophonic collision incidents)
- Detection methodologies and opportunities
- Detecting attempts via telemetry observation and patterns
- Detection using machine learning
- Protecting the brand; protecting critical assets
- Demo
- Tool for finding vulnerable homophonic collisions
- Attaining vulnerable assets
- Website with an inventory of vulnerable collisions
- Searching for vulnerabilities
Why is this material different? There has been a lot of research on domain generation algorithms for homoglyphs and typosquatting, and tools such as DNSTwist exist for finding such collisions. There has been very little research on homophonic collisions, which take advantage of human mishearing, and what does exist is largely academic, with little focus on practical implications. With the rise of voice assistants layered on top of AI systems such as ChatGPT, this vector will only grow in significance. We plan to focus on existing vulnerabilities in humans and our systems and to show, based on real-world examples, exactly why this is such a risk.
Will we release a white paper or tool? We will likely release a tool and a website with an inventory of vulnerable assets, so that you can check whether you are affected. We may also release an accompanying white paper.