
Introducing Sitecore Search

by Fabian Holtermann

Introduction

In today's digital age, users have become accustomed to searching for what they need rather than navigating through a website to find it. This is where the Sitecore Search SaaS product comes into play. With Sitecore Search, you can set up a search feature for your website within a couple of hours, making it easier for visitors to find what they are looking for. In this article, we will guide you through the steps to set up a search using Sitecore Search.

Step 1: Setting Up the Web Crawler

To enable search functionality, you need content that can be searched. Sitecore Search provides an advanced web crawler to index the content of your website. The first step is to go to the administration/source section and create a new source. Give it a name and a description, and choose a connector (Web Crawler (Advanced)). This takes you to the detail view of the newly created source, where you can configure the crawler to your needs.


Step 2: Define a Scope

Which sites should be crawled, and where the crawler's boundaries lie, is configured in the "Web Crawler Settings" section. Start by defining the allowed domains, for example dotpeekser.de. You can also enable "Render Javascript" if your website renders its content on the client, as many Next.js sites do; the sketch below shows a quick way to check whether that is necessary.
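As a quick sanity check outside of Sitecore Search, you can look at the raw HTML of a page and see whether the relevant meta tags are already present without executing any JavaScript. This is only a rough sketch: it assumes Node.js 18+ (for the global fetch) and a locally installed cheerio package, and the URL is just an example.

// Hedged sketch: fetch the raw HTML (no JavaScript executed) and check whether
// the description meta tags are already server-rendered. If they are missing,
// the crawler will need "Render Javascript" enabled to see them.
const cheerio = require('cheerio');

async function hasServerRenderedMeta(url) {
  const html = await (await fetch(url)).text();
  const $ = cheerio.load(html);
  return Boolean(
    $('meta[property="og:description"]').attr('content') ||
    $('meta[name="description"]').attr('content')
  );
}

hasServerRenderedMeta('https://dotpeekser.de').then((found) =>
  console.log(found ? 'Meta tags are in the initial HTML.' : 'Consider enabling "Render Javascript".')
);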

Step 3: Configure a Trigger

Triggers define where the crawler starts and let Sitecore Search pick up content changes. For now, we will use the sitemap of our site as the trigger.

To configure this, select "Sitemap" as the trigger type and fill in the URL to your sitemap. For example, if your website is https://dotpeekser.de, the URL to your sitemap might be https://dotpeekser.de/sitemap.xml.
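If you want to preview what the crawler will receive from that trigger, the small sketch below fetches the sitemap and lists the URLs it contains. It is only an illustration and assumes Node.js 18+, the cheerio package, and a flat sitemap rather than a sitemap index.

// Hedged sketch: list the page URLs a flat sitemap exposes to the crawler.
const cheerio = require('cheerio');

async function listSitemapUrls(sitemapUrl) {
  const xml = await (await fetch(sitemapUrl)).text();
  const $ = cheerio.load(xml, { xmlMode: true }); // keep <url>/<loc> tags intact
  return $('url > loc').map((_, el) => $(el).text().trim()).get();
}

listSitemapUrls('https://dotpeekser.de/sitemap.xml').then((urls) =>
  console.log(`${urls.length} URLs found, e.g.`, urls.slice(0, 5))
);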

Step 4: Define a Document Extractor

Document extractors define which parts of a crawled page end up in the index. Start by giving the extractor a name and selecting its extractor type. We will choose the type "Javascript" and use the Cheerio syntax. A simple example that extracts the meta tags of your site is shown in the code snippet below.

Once you have filled in the fields of your content definition, the available fields show up in the administration/domain settings/attributes section. With these configurations in place, you can send the crawler off; it will run for the first time and index all available content.

function extract(request, response) {
  // Parse the crawled HTML with Cheerio so it can be queried via CSS selectors.
  $ = cheerio.load(response.body);

  // Return one document per page, falling back to generic tags whenever the
  // Open Graph meta tags are missing.
  return [{
    'description': $('meta[property="og:description"]').attr('content') || $('meta[name="description"]').attr('content') || $('p').text(),
    'name': $('meta[property="og:title"]').attr('content') || $('title').text(),
    'type': $('meta[property="og:type"]').attr('content') || 'content',
    'url': $('meta[property="og:url"]').attr('content') || $('link[rel="canonical"]').attr('href') || 'empty_link'
  }];
}
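Before saving the extractor, it can be reassuring to run the same logic locally against a real page and look at the documents it produces. The following sketch is not part of Sitecore Search; it assumes Node.js 18+, the cheerio package, and that the extract function above is defined in the same file.

// Hedged sketch: call the extractor above with a fetched page and print the
// resulting documents, mimicking the shape of the crawler's request/response.
const cheerio = require('cheerio');

async function testExtract(url) {
  const body = await (await fetch(url)).text();
  const documents = extract({ url }, { body });
  console.log(JSON.stringify(documents, null, 2));
}

testExtract('https://dotpeekser.de');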

Conclusion

By completing the steps outlined above, we have defined what content will be indexed and how it will be crawled. In the next part, we will examine the indexed content and explore how we can tailor the search results to meet our specific needs.

