Search has evolved past identifying information via keywords into what’s now known as semantic search. This method of search not only considers keywords, but also the context of the search and the intent of the searcher. For example, a search in the mid-2000s might have read something like “best doctors in Phoenix,” whereas a search today would read more like “who is the best cardiologist in the greater Phoenix area?” People can now search in a way that’s similar to how they would speak, and that’s all thanks to semantics.
To understand how semantic search works, it helps to understand Google’s filtering algorithm. After a user enters a search, Google first tries to understand it. This understanding phase is where the semantic aspect comes in: instead of simply matching keywords, Google performs a process called word-sense disambiguation (WSD). When a word has multiple meanings, WSD determines which meaning applies from the context in which the word is used, which enables the algorithm to provide the searcher with the proper information. After WSD, Google retrieves the information, filters and clusters it, ranks it, and finally presents the results to the searcher.
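To make the idea of word-sense disambiguation concrete, here is a minimal sketch of a simplified Lesk-style approach, which picks the sense whose dictionary gloss shares the most words with the query. This is not Google’s implementation, which is not public. The `SENSES` inventory, its glosses, and the function names are all hypothetical and chosen only for illustration.

```python
# Toy sense inventory. The glosses are illustrative assumptions,
# not data from a real dictionary or from any search engine.
SENSES = {
    "bank": {
        "financial institution": "an institution that accepts money deposits and makes loans",
        "river edge": "the sloping land beside a river or stream of water",
    }
}

STOPWORDS = {"a", "an", "the", "of", "or", "and", "that", "to", "in", "on", "at", "i", "my"}


def tokenize(text):
    """Lowercase the text, strip basic punctuation, and drop stopwords."""
    words = (w.strip(".,?!\"'").lower() for w in text.split())
    return {w for w in words if w and w not in STOPWORDS}


def disambiguate(word, query):
    """Return the sense whose gloss overlaps most with the query's other words."""
    context = tokenize(query) - {word}
    best_sense, best_overlap = None, -1
    for sense, gloss in SENSES[word].items():
        overlap = len(context & tokenize(gloss))
        if overlap > best_overlap:  # ties keep the first sense listed
            best_sense, best_overlap = sense, overlap
    return best_sense


print(disambiguate("bank", "how do I deposit money at the bank"))
print(disambiguate("bank", "fishing from the bank of the river"))
```

In this toy setup the first query shares the word “money” with the financial gloss, and the second shares “river” with the river gloss, so each query resolves to a different sense of the same keyword.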
Because of WSD, Google can better understand how to rank results pages. Although the specifics of Google’s algorithm are a secret, the company is known to use an algorithm called PageRank, which scores pages by the links pointing to them, alongside other relevance signals. Factors commonly cited as influencing a page’s score include:
- The length of time a webpage has been in existence
- Both frequency and location of keywords
- The number of pages that link to the pages associated with the search
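The link-based factor in the list above can be sketched with the textbook PageRank calculation, in which a page’s score depends on the scores of the pages linking to it. The sketch below is a simplified version based on the published formulation, not Google’s production system. The three-page `LINKS` graph is a hypothetical example, and the damping factor of 0.85 is the commonly cited default rather than a known production value.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Compute PageRank scores by power iteration.

    `links` maps each page to the list of pages it links to.
    Every linked page must also appear as a key.
    """
    pages = list(links)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}  # start from a uniform score

    for _ in range(iterations):
        # Every page gets a small baseline share of the total score.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page, outlinks in links.items():
            if outlinks:
                # Split this page's score evenly among the pages it links to.
                share = damping * rank[page] / len(outlinks)
                for target in outlinks:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its score over all pages.
                for target in pages:
                    new_rank[target] += damping * rank[page] / n
        rank = new_rank
    return rank


# Hypothetical three-page site used only for illustration.
LINKS = {
    "home": ["about", "blog"],
    "about": ["home"],
    "blog": ["home", "about"],
}

scores = pagerank(LINKS)
for page in sorted(scores, key=scores.get, reverse=True):
    print(page, round(scores[page], 3))
```

In this small graph, “home” ends up with the highest score because both other pages link to it, which illustrates why the number of inbound links matters.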
Google also considers a number of other factors when evaluating word context. These include a user’s search history, spelling variations, the distance between terms, the time between previous searches and the user’s location. When creating content, it’s important to present it in a way that’s not just shareable, but also searchable using commonly understood language. Understanding the co-occurrence of words also helps when optimizing for search. For example, if a user were to search for the word “tomato,” words that frequently appear alongside it, such as “slice,” “cook” and “dice,” would also be taken into account.
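Co-occurrence can be illustrated by counting which words show up in the same documents as a target term. The sketch below is one simple way to do that, assuming a tiny hand-written corpus. `DOCS`, `STOPWORDS`, and `cooccurring_terms` are hypothetical names, and real search engines work with far larger collections and more sophisticated statistics.

```python
from collections import Counter

# Tiny illustrative corpus; the sentences are made up for this example.
DOCS = [
    "Slice the tomato thinly",
    "Dice the tomato and cook it",
    "Cook the tomato sauce slowly",
    "Slice the bread",
]

STOPWORDS = {"the", "and", "it", "a", "an"}


def cooccurring_terms(documents, target, top_n=3):
    """Count the words that appear in the same document as `target`."""
    counts = Counter()
    for doc in documents:
        words = {w.strip(".,?!").lower() for w in doc.split()}
        words = {w for w in words if w and w not in STOPWORDS}
        if target in words:
            # Count each co-occurring word once per document.
            counts.update(words - {target})
    return counts.most_common(top_n)


print(cooccurring_terms(DOCS, "tomato", top_n=1))
```

With this corpus, “cook” appears in two of the three documents that mention “tomato,” so it comes out as the strongest co-occurring term. The document about bread is ignored because it never mentions the target word.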
Semantic search has been evolving since the earliest days of search engine technology. As more information gets added to the internet, search engines continue to develop new methods of filtering through the millions of pages to provide users with the links most relevant to them.
At Digitile, we became very interested in semantic search and language processing because we were sick and tired of using outdated search algorithms in business. If you interact with the average file storage system, you’ve no doubt noticed that the deeper a file is buried, the harder it is to find. Folders upon folders of documents, PowerPoints and images get in the way of what we are trying to accomplish. Digitile changes how the workplace finds documents, right from your Google Chrome browser. It’s built into Gmail and lets you search, find, share, and tag files stored and managed in Gmail and other cloud solutions without ever leaving your inbox.
We created a tool that uses semantic search because Google has trained our brains to ask for what we want, how we want it. The Digitile platform allows users to work the way they surf the web. Rather than requiring users to remember an exact file name or location, Digitile organizes documents and images across all connected cloud-storage platforms and, using semantic search queries, retrieves the file(s) a user needs in seconds. We’ve ditched folders and let users “speak” to their storage systems in natural language to find files faster. Digitile functions like “Google for work.”
The idea of talking or typing to computers as you would to a friend or colleague is now a reality. It may not be as advanced as HAL from “2001: A Space Odyssey” yet, but it will be the way we interact with our work machines, and it will make life easier. Algorithms that include semantic search and natural language processing will eventually take over as the way we search all things digital: documents, online conversations, images and more.