<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>TemplatesFactory.Net Articles &#187; Search Engines</title>
	<atom:link href="http://www.templatesfactory.net/articles/category/search_engines/feed" rel="self" type="application/rss+xml" />
	<link>http://www.templatesfactory.net/articles</link>
	<description>TemplatesFactory.Net Articles!</description>
	<lastBuildDate>Fri, 16 Jul 2010 13:26:38 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=abc</generator>
		<item>
		<title>The Anatomy of an Automated Search Engine</title>
		<link>http://www.templatesfactory.net/articles/the-anatomy-of-an-automated-search-engine.html</link>
		<comments>http://www.templatesfactory.net/articles/the-anatomy-of-an-automated-search-engine.html#comments</comments>
		<pubDate>Mon, 23 Jun 2008 06:20:43 +0000</pubDate>
		<dc:creator>Hasan</dc:creator>
				<category><![CDATA[Search Engines]]></category>
		<category><![CDATA[automated search engine]]></category>
		<category><![CDATA[search]]></category>
		<category><![CDATA[search engines]]></category>

		<guid isPermaLink="false">http://www.templatesfactory.net/articles/?p=86</guid>
		<description><![CDATA[The creation of a large-scale search engine is an onerous task and one that involves huge challenges. A perfect automated search engine is one that crawls the Web quickly to gather all documents regularly to keep them up-to-date. Plenty of storage space is required to efficiently store indices or the documents themselves. The magnitude of [...]]]></description>
			<content:encoded><![CDATA[<p>The creation of a large-scale search engine is an onerous task and one that involves huge challenges.</p>
<p>A perfect automated search engine is one that crawls the Web quickly to gather all documents regularly to keep them up-to-date. Plenty of storage space is required to efficiently store indices or the documents themselves.</p>
<p>The magnitude of data that has to be handled on the ever-growing Internet is enormous, and search engines answer billions of queries daily. The indexing system of a search engine should be capable of processing huge amounts of data while using its storage efficiently and handling thousands of queries per second. Users should get the best possible navigation experience: the ability to find almost anything on the Web, with junk results excluded by high-precision tools.</p>
<p>The anatomy of a search engine includes major applications such as those that allow for crawling the Web, indexing, and searching.</p>
<p><strong>Web Crawling</strong></p>
<p>Search engines today depend on spiders or robots (special software) designed to continuously search the Web to find new pages.</p>
<p>Web crawling is the most important aspect of a search engine and is also the most challenging. It involves interaction with thousands of Web servers and name servers, and it is performed by many fast, distributed crawlers. A URL server keeps sending them lists of URLs that they need to crawl and store. The crawlers start with the most used servers and highly popular pages. Each crawler keeps hundreds of connections open at one time in order to retrieve Web pages quickly. For each page, the crawler has to look up the DNS, connect to the host, send a request and receive a response. It does not rank the Web pages, but retrieves copies of all of them, compresses them, and stores them in a repository. They&#8217;re later indexed and ranked based on different criteria. Everything from the visible text, images, alt tags, other non-HTML content, word processor documents, and more is indexed.</p>
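<p>The crawl-and-store loop described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names <code>crawl</code>, <code>fake_fetch</code> and <code>repository</code> are made up for the example, the URL server is reduced to a simple queue, and the network step is replaced by a stand-in function so the sketch runs without a connection.</p>

```python
import zlib
from collections import deque
from typing import Callable, Dict, Iterable


def crawl(seed_urls: Iterable[str], fetch: Callable[[str], bytes]) -> Dict[str, bytes]:
    """Fetch every URL once and keep a compressed copy, keyed by URL."""
    frontier = deque(seed_urls)          # stands in for the list handed out by a URL server
    repository: Dict[str, bytes] = {}    # hypothetical in-memory repository
    while frontier:
        url = frontier.popleft()
        if url in repository:
            continue                     # this page is already stored
        page = fetch(url)                # DNS lookup, connect, request, response
        repository[url] = zlib.compress(page)
    return repository


def fake_fetch(url: str) -> bytes:
    """Stand-in for a real HTTP request (an assumption made for the example)."""
    return ("page body for " + url).encode("utf-8")


repo = crawl(
    ["http://example.com/a", "http://example.com/b", "http://example.com/a"],
    fake_fetch,
)
```

<p>In a real crawler the <code>fetch</code> argument would perform an actual HTTP request, and many such loops would run in parallel; neither is shown here.</p>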
<p>Crawlers usually visit the same Web pages repeatedly to ensure the site is a stable one and that the pages are being updated frequently. If a certain Web page is not functioning at some point, the crawlers are usually programmed to go back later to try again. However, if it is found that the page is either down continuously or not being updated frequently, they stay away for longer periods of time or index it slowly.</p>
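<p>One way to picture this revisit behavior is a delay that shrinks for pages that respond and change, and grows for pages that are down or stale. The sketch below is an assumption for illustration only: the function name <code>next_visit_delay</code>, the doubling and halving rule, and the one-hour and thirty-day limits are not taken from any particular search engine.</p>

```python
def next_visit_delay(current_delay_hours: float, reachable: bool, changed: bool,
                     min_hours: float = 1.0, max_hours: float = 720.0) -> float:
    """Return how long to wait before crawling the same page again."""
    if not reachable or not changed:
        # Page was down or has not been updated: stay away for longer.
        new_delay = current_delay_hours * 2
    else:
        # Page is up and changing: come back sooner.
        new_delay = current_delay_hours / 2
    # Keep the delay inside the allowed range.
    return max(min_hours, min(max_hours, new_delay))
```

<p>For example, a page crawled daily that turns out to be unreachable would next be tried after 48 hours, while a page that keeps changing would move toward the one-hour minimum.</p>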
<p>Crawlers also have the capability of following all the links found on Web pages, which they can then visit either right away or later.</p>
<p><strong>Web Indexing</strong></p>
<p>Every result that is found by the search engine spider is sent for indexing, to ensure speed in finding relevant documents for a search query. Indexing is performed by the indexer or the catalog and the sorter. The index is like a huge book that contains a copy of all the Web pages that the spider finds. When a Web page changes, this book is also updated automatically.</p>
<p>The indexer does a variety of jobs, such as reading the repository, decompressing the compressed documents, and parsing them.  Basically, the function of the indexer is to allow information to be found as easily and as quickly as possible.</p>
<p>The repository where the crawler stores the Web page details contains the complete HTML of every single Web page.  All the documents are stored in the repository, and each and every Web page that is available on the Web is given a unique doc ID number, which is assigned whenever a new URL is collected from a Web page.</p>
<p>The indexer extracts all the information from each and every document and stores it in a database.  All high-quality search engines index each and every word in documents and give them a unique word ID.  Then the word occurrences, which some search engines call “hits,” are checked. All of the words are recorded, including their placement in the document, their font size and capitalization.</p>
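<p>A minimal sketch of this step, assuming plain text input, might look like the following. The names <code>index_document</code>, <code>lexicon</code> and <code>forward_index</code> are hypothetical, and font size is left out because plain text carries no such information.</p>

```python
import re
from typing import Dict, List, Tuple

Hit = Tuple[int, int, bool]  # (word ID, position in the document, capitalized?)


def index_document(text: str, lexicon: Dict[str, int]) -> List[Hit]:
    """Turn one document into a list of hits, assigning word IDs as new words appear."""
    hits: List[Hit] = []
    for position, word in enumerate(re.findall(r"[A-Za-z0-9']+", text)):
        key = word.lower()
        # Reuse the existing word ID, or hand out the next free one.
        word_id = lexicon.setdefault(key, len(lexicon))
        hits.append((word_id, position, word[0].isupper()))
    return hits


lexicon: Dict[str, int] = {}
forward_index = {
    0: index_document("Search engines crawl the Web", lexicon),  # doc ID 0
    1: index_document("The Web is large", lexicon),              # doc ID 1
}
```

<p>Here the result is keyed by doc ID, which is what makes it a forward index: for each document, it lists the words that occur in it.</p>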
<p><strong>Parsing</strong></p>
<p>The indexer also parses out all the links in each and every Web page and stores their information separately in another file, including where the links are coming from and pointing to, as well as the text of the link.  After the parsing is done, the indexer segregates these hits. It performs the initial sorting, thus creating a forward index that is partially sorted.</p>
<p>The file containing the links is then read. The relative URLs in it are converted into absolute URLs, which in turn are mapped to unique doc IDs. The anchor text associated with the doc IDs is also entered into the forward index. Then a database of links is created, which consists of pairs of doc IDs. This database is used to compute the page ranks of all the documents.</p>
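<p>The URL resolution step can be illustrated with Python's standard <code>urllib.parse.urljoin</code>. The helper names <code>doc_id_for</code> and <code>resolve_links</code>, and the example URLs, are assumptions made for this sketch.</p>

```python
from typing import Dict, List, Tuple
from urllib.parse import urljoin


def doc_id_for(url: str, doc_ids: Dict[str, int]) -> int:
    """Return the doc ID for a URL, assigning a new one the first time it is seen."""
    return doc_ids.setdefault(url, len(doc_ids))


def resolve_links(page_url: str, links: List[Tuple[str, str]],
                  doc_ids: Dict[str, int]):
    """Convert (href, anchor text) entries into doc ID pairs and anchor records."""
    source_id = doc_id_for(page_url, doc_ids)
    pairs = []    # (from doc ID, to doc ID), the input for page rank
    anchors = []  # (to doc ID, anchor text), to be added to the forward index
    for href, anchor_text in links:
        absolute = urljoin(page_url, href)  # relative URL -> absolute URL
        target_id = doc_id_for(absolute, doc_ids)
        pairs.append((source_id, target_id))
        anchors.append((target_id, anchor_text))
    return pairs, anchors


doc_ids: Dict[str, int] = {}
pairs, anchors = resolve_links(
    "http://example.com/news/index.html",
    [("today.html", "today's stories"),
     ("/about.html", "about us"),
     ("http://example.org/", "a partner site")],
    doc_ids,
)
```

<p>Note that the anchor text is stored against the page being pointed to, not the page the link appears on.</p>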
<p><strong>Sorting</strong></p>
<p>The job of the sorter is to take the forward indexes, which are sorted by doc ID, and sort them again by word ID to generate the inverted index. This process is done one index at a time and does not require much temporary storage. Parallel sorting is carried out by multiple sorters. Since these indexes do not fit into main memory, the sorter subdivides them into smaller groups based on word ID and doc ID, then loads each group into memory, sorts it, and writes out its contents.</p>
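<p>The core idea of the sorter, regrouping hits by word instead of by document, can be sketched as follows. This is a simplified, in-memory version: the function name <code>invert</code> and the sample data are assumptions, and the splitting into groups that fit in memory is not shown.</p>

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def invert(forward_index: Dict[int, List[Tuple[int, int]]]) -> Dict[int, List[Tuple[int, int]]]:
    """Regroup (word ID, position) hits per document into (doc ID, position) lists per word."""
    inverted = defaultdict(list)
    for doc_id in sorted(forward_index):
        for word_id, position in forward_index[doc_id]:
            inverted[word_id].append((doc_id, position))
    # Order the result by word ID, and each word's list by doc ID and position.
    return {word_id: sorted(postings) for word_id, postings in sorted(inverted.items())}


# doc ID -> list of (word ID, position)
forward_index = {
    0: [(0, 0), (1, 1), (2, 2)],
    1: [(2, 0), (0, 1)],
}
inverted_index = invert(forward_index)
```

<p>After this step, looking up a word ID gives the list of documents that contain it, which is what a query needs.</p>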
<p><strong>Page Rank</strong></p>
<p>Page rank is thought of as the value of a Web page, and it can be read as a model of user behavior: a page that many other pages point to is one that a user following links is likely to reach. A search engine usually determines it from the number of pages pointing to a Web page and from the page rank of the pages pointing to it; some engines may also weigh in the number of visits to a page or group of pages. Page rank has other extensions too, and it is weighted by the link structure of the Web.</p>
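<p>The link-based part of this idea can be sketched as a repeated redistribution of rank along links. The version below is a simplified illustration, not the method of any specific search engine: the function name <code>page_rank</code>, the damping value of 0.85, the fixed number of iterations and the sample links are all assumptions, and visit counts are not modelled.</p>

```python
from typing import Dict, List, Tuple


def page_rank(links: List[Tuple[int, int]], damping: float = 0.85,
              iterations: int = 50) -> Dict[int, float]:
    """Compute a rank for every doc ID that appears in the (from, to) link pairs."""
    nodes = sorted({doc for pair in links for doc in pair})
    out_links = {doc: [] for doc in nodes}
    for source, target in links:
        out_links[source].append(target)

    rank = {doc: 1.0 / len(nodes) for doc in nodes}  # start with equal ranks
    for _ in range(iterations):
        new_rank = {doc: (1.0 - damping) / len(nodes) for doc in nodes}
        for source in nodes:
            # A page without outgoing links spreads its rank over all pages.
            targets = out_links[source] or nodes
            share = damping * rank[source] / len(targets)
            for target in targets:
                new_rank[target] += share
        rank = new_rank
    return rank


# Pairs of doc IDs: (page containing the link, page pointed to)
links = [(0, 1), (0, 2), (1, 2), (2, 0)]
ranks = page_rank(links)
```

<p>In this small example, page 2 is pointed to by both of the other pages and ends up with a higher rank than page 1, which has only one incoming link.</p>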
<p><strong>Searching</strong></p>
<p>The final step is searching, which means providing the best quality search results for queries. The query is first parsed, and the words in the query are converted to word IDs. For every word, the search looks up its doc list, and the doc lists are scanned until documents that contain all the words in the query are found. The page rank of those documents is then computed, the matching documents are sorted by page rank, and the results are returned to the user.</p>
<p>The major issue that users face with some search engines is the poor quality of the results returned, which wastes a lot of time. It can be frustrating when they can&#8217;t find the information they&#8217;re looking for.</p>
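<p>Put together, the query steps might look like the sketch below. It assumes a tiny hand-made lexicon, an inverted index that maps word IDs to doc IDs, and precomputed page ranks; the function name <code>search</code> and all the sample data are hypothetical.</p>

```python
from typing import Dict, List


def search(query: str, lexicon: Dict[str, int],
           inverted_index: Dict[int, List[int]],
           ranks: Dict[int, float]) -> List[int]:
    """Return the doc IDs that contain every query word, best page rank first."""
    word_ids = []
    for word in query.lower().split():   # parse the query
        if word not in lexicon:
            return []                    # an unknown word cannot match any document
        word_ids.append(lexicon[word])   # convert each word to its word ID
    if not word_ids:
        return []

    # Keep only the documents that appear in the doc list of every word.
    matching = set(inverted_index.get(word_ids[0], []))
    for word_id in word_ids[1:]:
        matching &= set(inverted_index.get(word_id, []))

    # Sort by page rank, highest first; ties are broken by doc ID.
    return sorted(matching, key=lambda doc_id: (-ranks.get(doc_id, 0.0), doc_id))


lexicon = {"search": 0, "engine": 1, "web": 2}
inverted_index = {0: [0, 1, 2], 1: [0, 2], 2: [1, 2]}
ranks = {0: 0.2, 1: 0.3, 2: 0.5}
results = search("Search engine", lexicon, inverted_index, ranks)
```

<p>With this data, documents 0 and 2 contain both words, and document 2 comes first because it has the higher page rank.</p>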
<p>A high quality search engine returns quality and relevant results.  Besides the quality of the results, it has to be designed to be able to scale to the growing size of the Web, by using storage efficiently.  A search engine has to ensure that the huge number of documents on the Web can be crawled, indexed, and searched with little cost.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.templatesfactory.net/articles/the-anatomy-of-an-automated-search-engine.html/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Google To Boost Web Applications!</title>
		<link>http://www.templatesfactory.net/articles/google-to-boost-web-applications.html</link>
		<comments>http://www.templatesfactory.net/articles/google-to-boost-web-applications.html#comments</comments>
		<pubDate>Sun, 06 Apr 2008 09:54:11 +0000</pubDate>
		<dc:creator>Hasan</dc:creator>
				<category><![CDATA[Google]]></category>

		<guid isPermaLink="false">http://www.templatesfactory.net/articles/?p=84</guid>
		<description><![CDATA[Online maps are extremely popular and millions of people use them every day to either find local businesses, to obtain driving directions, to see high-definition aerial images of places or even to check real-time traffic information. Google, Yahoo, Microsoft and AOL compete with each other to improve their online mapping services, which have become an [...]]]></description>
			<content:encoded><![CDATA[<p>Online maps are extremely popular and millions of people use them every day to either find local businesses, to obtain driving directions, to see high-definition aerial images of places or even to check real-time traffic information.</p>
<p>Google, Yahoo, Microsoft and AOL compete with each other to improve their online mapping services, which have become an important part of local search websites.</p>
<p>Google is the unbeaten leader in web computing, that is, software that is delivered over the internet and runs inside a browser. However, many browser applications cannot do everything that powerful PC-based software can. Google has been trying to close that gap, and its efforts are finally bearing fruit.</p>
<p>Google said that third party developers can now use the programming interfaces to Google Earth, which is their 3-D visualization software.  This will enable developers to embed Google Earth on websites.</p>
<p>Google Maps is currently used by thousands of websites that have built applications on it to do things like pinpointing where a crime has taken place, showing apartment rentals in various cities, and even showing the path of airplanes in flight. These sites will now be able to improve those applications with better visualization software from Google Earth. Developers will also be able to use Google Earth’s 3-D imaging to create new applications to run on their sites. These applications will be embedded in the websites and accessible through a browser, and they will work even if users have not installed Google Earth on their computers.</p>
<p>However, users who want the full spectrum of features and data available in Google Earth will still have to download the software onto their computers.</p>
<p>Google also says that it is possible to overlay content on the stars, planets and galaxies by using the Sky mode toggle, which lets people build 3-D Google Sky mashups. Developers can also enable Google 3-D buildings with just one line of JavaScript and fetch KML data from the web; KML is the file format used to display geographic data in a browser.</p>
<p>Currently, this plug-in only works with Windows Vista or XP; however, support for the other operating systems is being planned for the future.</p>
<p>According to Google, a year ago it introduced a set of software tools called Gears, which help developers build browser applications that run like PC-based software. MySpace used these tools to revamp its mail service. This gives MySpace users the ability to easily sort and search all their mail messages without having to click through each and every page to find what they are looking for.</p>
<p>MySpace’s use of Gears is the largest attempt so far to use these tools, and the most successful. Google believes that this will encourage other companies to use Gears to either improve their web applications or create new ones.</p>
<p>Google itself is planning to use the tools to make Gmail available to users even when they are not connected to the internet. Google is not giving this project a deadline; however, it does hope to deliver it this year.</p>
<p>With Google taking giant strides in the web applications department, the future certainly holds many more exciting improvements for webmasters.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.templatesfactory.net/articles/google-to-boost-web-applications.html/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Need Ideas For Your Website? Your Competition Is Your Best Tool!</title>
		<link>http://www.templatesfactory.net/articles/need-ideas-for-your-website-your-competition-is-your-best-tool.html</link>
		<comments>http://www.templatesfactory.net/articles/need-ideas-for-your-website-your-competition-is-your-best-tool.html#comments</comments>
		<pubDate>Mon, 18 Feb 2008 07:15:25 +0000</pubDate>
		<dc:creator>Hasan</dc:creator>
				<category><![CDATA[Internet Marketing]]></category>
		<category><![CDATA[Search Engines]]></category>

		<guid isPermaLink="false">http://www.templatesfactory.net/articles/?p=82</guid>
		<description><![CDATA[If you are looking to start an internet business, the first thing that has to be done is choosing a niche market. We know there is plenty of competition out there for every niche. But not everybody offers the same as you do. The first thing that has to be done is to get a [...]]]></description>
			<content:encoded><![CDATA[<p>If you are looking to start an internet business, <strong>the first thing that has to be done is choosing a niche market.</strong> We know there is plenty of competition out there for every niche.  But not everybody offers the same as you do.</p>
<p>The next thing to do is to get a good idea of whether to pursue a niche market or not. This entails keyword research, which includes coming up with keywords that people are searching for and finding out how much competition there is for each keyword.</p>
<p>Initially, it is no good going for generic keywords that have a lot of competition. For example, “Golf” is a generic term with millions of sites. If you target your keywords better, such as “Golf balls” or “Travel Golf,” you will find fewer searches, but you are aiming at a particular audience in the niche. Remember! It is always better to be a big fish in a small pond than a small fish in a big pond.</p>
<p>After you have chosen your niche, choose a few more keywords specific to your niche. Now it is time to check on your competition. What are the keywords and phrases they are using? Right-click on one of their pages, choose “view source,” and look at the META description tag (you will find it near the top). Are these keyphrases helpful to you? Narrow down your list. Search on Google with the keyphrases you are planning on using and see if any similar sites are coming up. If they are, you are on the right track. Some of these keyphrases will be used on your site in the content and some will be used as links.</p>
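<p>If you would rather not read page sources by hand, the META tags can be pulled out with a short script. The sketch below uses Python's standard <code>html.parser</code> module; the class name <code>MetaTagReader</code>, the helper <code>read_meta</code> and the sample page are made up for illustration, and downloading a competitor's page is left out.</p>

```python
from html.parser import HTMLParser
from typing import Dict


class MetaTagReader(HTMLParser):
    """Collect the description and keywords META tags from a page source."""

    def __init__(self) -> None:
        super().__init__()
        self.meta: Dict[str, str] = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = (attributes.get("name") or "").lower()
        if name in ("description", "keywords"):
            self.meta[name] = attributes.get("content") or ""


def read_meta(html: str) -> Dict[str, str]:
    reader = MetaTagReader()
    reader.feed(html)
    return reader.meta


# A made-up page source standing in for a competitor's site.
sample = (
    '<html><head><title>Golf</title>'
    '<meta name="Description" content="Travel golf bags and golf balls">'
    '<meta name="keywords" content="golf balls, travel golf">'
    '</head><body></body></html>'
)
meta = read_meta(sample)
```

<p>The resulting dictionary holds the description and keyword phrases, which you can then compare against your own keyword list.</p>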
<p>To do well with your website, it is important to be high up in the search engine rankings.  Anyone who tells you otherwise is wrong.  How many times have you gone to the 3rd page of Google search results when looking for something?  Most people click on the first few results only.</p>
<p>The question you should be asking is, “What have the other sites done to reach the top?”</p>
<p>Start at the very beginning and take a look at the design of their sites, including colors that have been used and the template.  The idea is to see how the site layout is and do one better than them.</p>
<p>But that’s where the importance of looks ends. You will find that many an ugly site is at the very top. There could be several reasons for this, such as great information or good back links. Look at the content the other sites are providing and ensure you do better.</p>
<p>Now that we have our keywords ready and we have created the design of our site, the next step is to bring traffic to the site. A great-looking site without traffic is of no use.</p>
<p>This can be achieved through on-page optimization and off-page optimization. On-page optimization includes incorporating keywords in the text, title, meta tags and headings. If you notice in the source that some of the top sites do not do this, your site has a good chance of faring better when you do.</p>
<p>Off-page optimization can make the difference between a top-ranking page and a low-ranking one.</p>
<p>One method is getting other websites in your niche to link to your site. The more links there are, the better the ranking. It is not just the number of links that matters; the page rank of the sites linking to you is also very important.</p>
<p>Take a look at the top ten sites in the same niche.  Try and find out the techniques they used.  Make a list of these sites.</p>
<p>Now take one of those top sites and type link:http://www.sitename.com/ into Google. Instead of ‘sitename,’ type the actual name of the site. You will then be able to see the sites that are linking to your competitor’s site.</p>
<p>You will have to find out the page ranks of all these sites. You can download the Google Toolbar for this. If the page ranks of the sites linking back to your competitor’s site are not too high, you have a chance of being up there.</p>
<p>I know it all sounds complicated. But when did anyone say that being the top site is easy? Going through the whole process a few times will help it register.</p>
<p>In simple words, a complete website can be built along with great back links, just by observing what the competition is doing.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.templatesfactory.net/articles/need-ideas-for-your-website-your-competition-is-your-best-tool.html/feed</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Google</title>
		<link>http://www.templatesfactory.net/articles/google.html</link>
		<comments>http://www.templatesfactory.net/articles/google.html#comments</comments>
		<pubDate>Fri, 20 Apr 2007 06:55:51 +0000</pubDate>
		<dc:creator>Hasan</dc:creator>
				<category><![CDATA[Search Engines]]></category>

		<guid isPermaLink="false">http://74.53.81.218/articles/google.html</guid>
		<description><![CDATA[Google has quickly become the search engine of choice for a vast number of internet browsers, requiring web developers to focus heavily on optimizing page visibility for Google’s crawlers. Ensuring quick access to every file on a site can mean more pages get indexed, allowing for better keyword results. Many developers focus on directory structure [...]]]></description>
			<content:encoded><![CDATA[<p>Google has quickly become the search engine of choice for a vast number of internet browsers, requiring web developers to focus heavily on optimizing page visibility for Google’s crawlers.  Ensuring quick access to every file on a site can mean more pages get indexed, allowing for better keyword results.  Many developers focus on directory structure to enhance results, a practice that can make most sites, especially those containing hundreds or thousands of pages, easy to manage.  Still, a directory must be more than a random organization of folders and files.  Good directories improve indexing speed and provide an organizational structure to the site.  Consider the following examples when organizing your site. </p>
<p>A news website may run as many as fifty new pages in a day. Certainly the site wants to maintain readership by allowing access to archived stories, even by having them show up in major search engines such as Google. While storing all of the files in the root directory may get them “crawled,” or indexed, by the search engine, they have no structure. As a result, it may be difficult for the webmaster, and subsequently the search engine, to keep close track of which pages have been indexed and which require more links. The site may also suffer from congestion as browsers dig through thousands of files within the root directory to locate the single file that may be needed.</p>
<p>A site of this scale can benefit both internally and externally from good directory structure. Internally, the site can remain organized and easy to navigate for any updates or maintenance. Externally, the site is more navigable, allowing accurate feedback about pages that require more links, or clearer links, to index properly. A solid directory structure can also support searching by category name, giving browsers a better chance to land at your site.</p>
<p>Implementing a complex directory can also raise several concerns.  First, a site must designate a creator of, or agree upon, a directory structure.  The directory structure must follow a logic that can be understood by anyone needing access to the site, not just the primary webmaster.  As programmers change jobs or leave sites, directory logic may get lost or confused, and consequently difficult to manage.  Another potential problem is then implementing a new system.  A developer may devise a more accurate or logical directory structure than the current system.  How do we migrate to the new system?  Also, changing the directory structure, or parts of the directory, may require robust software, capable of making changes to entire branches of the directory at a time.  Does your site have the resources to handle such a change? </p>
<p>As with many changes in web development technology, a directory structure must be implemented correctly to function effectively.  Consider the depth and complexity of your directory, and consult with your webmasters to determine the best organizational structure.  Be sure to have several individuals with intimate knowledge of the structure.  Should you need to know or change the structure for any reason, your development team will be sure to have the tools to do so. </p>
]]></content:encoded>
			<wfw:commentRss>http://www.templatesfactory.net/articles/google.html/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Importance of Unique Content for a Website</title>
		<link>http://www.templatesfactory.net/articles/importance-of-unique-content-for-a-website.html</link>
		<comments>http://www.templatesfactory.net/articles/importance-of-unique-content-for-a-website.html#comments</comments>
		<pubDate>Fri, 16 Mar 2007 09:23:41 +0000</pubDate>
		<dc:creator>Hasan</dc:creator>
				<category><![CDATA[Search Engines]]></category>

		<guid isPermaLink="false">http://74.53.81.218/articles/importance-of-unique-content-for-a-website.html</guid>
		<description><![CDATA[If content is king, unique content is the supreme master of the universe. It is important to have relevant content for both search engine optimization and visitors, but having unique, well-written content is crucial to the overall success of the online venture. What is Unique Content? Unique content, in its simplest form, is material on [...]]]></description>
			<content:encoded><![CDATA[<p>If content is king, unique content is the supreme master of the universe. It is important to have relevant content for both search engine optimization and visitors, but having unique, well-written content is crucial to the overall success of the online venture. </p>
<p><strong>What is Unique Content?</strong></p>
<p>Unique content, in its simplest form, is material on a website that is completely different from content anywhere else on the internet. It is unique.  The term usually refers to written words on the page, but can apply to other areas, such as charts or graphics, as well. </p>
<p><strong>Unique Content and SEO</strong></p>
<p>As they have evolved, search engines have grown much more concerned with unique content. Although nobody outside the major search engines knows their exact algorithms, it has become very clear that unique content is rewarded by increased page rank, and duplicate content is punished, sometimes quite severely. Duplicate content is content that is identical to material found on another website.</p>
<p>Some websites that pull feeds from other websites or news services might seem to skirt around the unique content issue, but there is always more to a site than RSS feeds. The more original content on a page, the more highly it is regarded by search engines. If any portion of a website contains materials that are seen elsewhere, the owner can expect the subsequent penalty.</p>
<p><strong>Unique Content and Traffic</strong></p>
<p>While traffic is a component of SEO, it is the aspect most heavily affected by quality content. A website is much like any other retail business. If a patron visits, but can’t find what they are looking for or simply doesn’t like what the store is offering, they won’t buy and they probably won’t return. </p>
<p>For this reason, it is imperative that your site offer visitors what they are seeking.  Excellent content that is updated on a regular basis will not only appeal to patrons, but it will keep them coming back for more. Traffic is the driving force of the internet. The more targeted traffic you have, the more effective your site will be. </p>
<p><strong>Creating Unique Content</strong></p>
<p>There are many ways to create unique content for a website. The most obvious solution is to write it yourself. This seems simple enough, but many struggle with finding the words or topics needed to fill pages of text. A common solution to this problem is to outsource the content to professional writers.  </p>
<p>Another common solution is to find well-written articles on applicable topics and rewrite them. To avoid the duplicate content filters and penalties, a common rule of thumb is that at least 30% of the rewritten content should be unique; search engines do not publish an exact threshold. Of course, the higher the unique percentage, the better off the material will be.</p>
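<p>A rough way to estimate how much of a rewrite is unique is to compare short runs of words between the two texts. The sketch below is only one possible measure and an assumption on our part: search engines do not publish how they detect duplicates, and the function names <code>shingles</code> and <code>unique_fraction</code>, the three-word window and the sample sentences are chosen for illustration.</p>

```python
from typing import Set, Tuple


def shingles(text: str, size: int = 3) -> Set[Tuple[str, ...]]:
    """Return the set of overlapping word runs of the given size."""
    words = text.lower().split()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def unique_fraction(rewrite: str, original: str, size: int = 3) -> float:
    """Share of the rewrite's word runs that do not appear in the original."""
    rewritten = shingles(rewrite, size)
    if not rewritten:
        return 0.0
    return len(rewritten - shingles(original, size)) / len(rewritten)


original = "the quick brown fox jumps over the lazy dog"
rewrite = "the quick brown fox leaps over a sleepy dog"
score = unique_fraction(rewrite, original)
```

<p>In this example five of the seven three-word runs in the rewrite are new, so the score is about 0.71, or roughly 71% unique by this measure.</p>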
<p>However your unique content is created, it is crucial that you obtain it. An otherwise excellent website will flounder if content is not appealing to visitors or search engines. It seems after all these years, the pen remains mightier than the sword. </p>
]]></content:encoded>
			<wfw:commentRss>http://www.templatesfactory.net/articles/importance-of-unique-content-for-a-website.html/feed</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
	</channel>
</rss>
