Apache Solr - Search Engine Basics



A search engine is a software system that maintains a huge database of Internet resources such as web pages, newsgroups, programs, and images. It helps users locate information on the World Wide Web.

Users search for information by submitting queries to the search engine in the form of keywords or phrases. The search engine then looks up the query in its database and returns relevant links to the user.


Search Engine Components

Generally, there are three basic components of a search engine as listed below −

  • Web Crawler − A web crawler, also known as a spider or bot, is a software component that traverses the web to gather information.

  • Database − The information gathered from the Web is stored in a database, which holds a huge volume of web resources.

  • Search Interfaces − This component is an interface between the user and the database. It helps the user to search through the database.
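
As a rough illustration, the three components can be sketched in a few lines of Python. This is a toy model only: the `TOY_WEB` dictionary stands in for the real Web, and the names `crawl`, `search`, and the `*.example` pages are hypothetical, introduced here for illustration rather than taken from Solr or any real search engine.

```python
from collections import deque

# Hypothetical stand-in for the Web: URL -> (page text, outgoing links).
TOY_WEB = {
    "a.example": ("apache solr search platform", ["b.example", "c.example"]),
    "b.example": ("web crawlers are also called spiders", ["c.example"]),
    "c.example": ("solr builds an index for search", ["a.example"]),
}


def crawl(start_url, web):
    """Web crawler: follow links breadth-first and collect page text."""
    database = {}                      # plays the role of the database component
    queue = deque([start_url])
    while queue:
        url = queue.popleft()
        if url in database or url not in web:
            continue                   # skip visited or unknown pages
        text, links = web[url]
        database[url] = text
        queue.extend(links)
    return database


def search(database, keyword):
    """Search interface: return URLs whose text contains the keyword."""
    keyword = keyword.lower()
    return sorted(url for url, text in database.items()
                  if keyword in text.lower().split())


database = crawl("a.example", TOY_WEB)
print(search(database, "solr"))  # ['a.example', 'c.example']
```

A real search engine would not scan every stored page on each query as `search` does here; it would consult an index built in advance, which is the subject of the next section.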

How do Search Engines Work?

Any search application is required to perform some or all of the following operations.

  1. Acquire Raw Content − The first step of any search application is to collect the target content on which the search is to be conducted.

  2. Build the Document − The next step is to build documents from the raw content in a form that the search application can understand and interpret easily.

  3. Analyze the Document − Before indexing can start, each document must be analyzed.

  4. Index the Document − Once the documents are built and analyzed, the next step is to index them so that a document can be retrieved based on certain keys instead of its whole contents. An index is similar to the one at the end of a book, where common words are listed with their page numbers so that they can be found quickly without searching the complete book.

  5. User Interface for Search − Once a database of indexes is ready, the application can perform search operations. To help the user make a search, the application must provide a user interface where the user can enter text and initiate the search process.

  6. Build Query − Once the user makes a request to search for some text, the application should prepare a query object from that text, which can then be used to query the index database for relevant details.

  7. Search Query − Using the query object, the index database is checked to get the relevant details and the matching content documents.

  8. Render Results − Once the required results are received, the application should decide how to display them to the user through its user interface.
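
The steps above can be sketched with a small in-memory inverted index. This is a minimal sketch under stated assumptions, not how Solr itself is implemented: the sample texts in `RAW_CONTENT` and every function name below are hypothetical, the analysis step is reduced to lowercasing and splitting on non-alphanumeric characters, and a query matches only documents that contain all of its terms.

```python
import re
from collections import defaultdict

# Step 1: acquire raw content (hypothetical sample data).
RAW_CONTENT = {
    1: "Solr is a search platform.",
    2: "An index makes search fast.",
    3: "Solr builds an inverted index.",
}


def build_document(doc_id, raw_text):
    """Step 2: wrap raw content in a structure the application understands."""
    return {"id": doc_id, "text": raw_text}


def analyze(text):
    """Step 3: break text into lowercase tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(documents):
    """Step 4: map each token to the set of document ids containing it."""
    index = defaultdict(set)
    for doc in documents:
        for token in analyze(doc["text"]):
            index[token].add(doc["id"])
    return index


def build_query(user_text):
    """Step 6: turn the user's text into a query object (a list of tokens)."""
    return analyze(user_text)


def run_query(index, query):
    """Step 7: return ids of documents that contain every query token."""
    if not query:
        return []
    matches = set.intersection(*(index.get(token, set()) for token in query))
    return sorted(matches)


def render(doc_ids, documents_by_id):
    """Step 8: format results for display."""
    return [f"[{i}] {documents_by_id[i]['text']}" for i in doc_ids]


documents = [build_document(i, text) for i, text in RAW_CONTENT.items()]
index = build_index(documents)
by_id = {doc["id"]: doc for doc in documents}

# Step 5 would be a real user interface; a fixed string stands in for user input.
hits = run_query(index, build_query("Solr index"))
for line in render(hits, by_id):
    print(line)  # [3] Solr builds an inverted index.
```

As with a book index, the lookup in `run_query` touches only the entries for the query terms rather than scanning the text of every document.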

Take a look at the following illustration. It shows an overall view of how Search Engines function.

[Illustration: Search Engine]

Apart from these basic operations, search applications can also provide an administration interface that lets administrators control the level of search available to each user profile. Analytics of search results is another important and advanced aspect of any search application.
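
One possible reading of these two features is sketched below. Everything in it is an assumption made for illustration: the numeric access levels, the `PROFILE_LEVELS` mapping, the sample documents, and the use of a simple query counter as "analytics" are hypothetical and do not describe any particular product.

```python
from collections import Counter

# Hypothetical documents, each tagged with the minimum level needed to see it.
DOCUMENTS = [
    {"title": "Public product guide", "level": 0},
    {"title": "Internal search tuning notes", "level": 1},
    {"title": "Admin-only search audit", "level": 2},
]

# Hypothetical user profiles mapped to the level an administrator granted them.
PROFILE_LEVELS = {"guest": 0, "staff": 1, "admin": 2}

query_log = Counter()   # simple analytics: how often each query was issued


def search(keyword, profile):
    """Return matching titles the given profile is allowed to see."""
    query_log[keyword.lower()] += 1
    allowed = PROFILE_LEVELS.get(profile, 0)   # unknown profiles get level 0
    return [doc["title"] for doc in DOCUMENTS
            if keyword.lower() in doc["title"].lower()
            and doc["level"] <= allowed]


print(search("search", "guest"))   # []
print(search("search", "admin"))
print(query_log.most_common(1))    # [('search', 2)]
```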
