Archive for June, 2008

Negative Aspects Of Outsourcing

Wednesday, June 25th, 2008

Outsourcing is subcontracting work to a third-party company or individual for product creation, design, or some other service. It has long been practiced by both major businesses and smaller companies. In the corporate world, this is known as business process outsourcing (BPO).

The Internet has long been bitten by the outsourcing bug. Most online businesses outsource the bulk of their product creation, and some have started outsourcing their marketing as well. Internet marketers outsource projects such as e-book creation, article writing, Web development and design, software development, compiling detailed lists of affiliate programs, sorting information into lists, and any other job that is either time consuming or requires expertise they don’t personally possess.

Some describe outsourcing as an unfortunate by-product of a global economy. The thinking is simple: if someone else can do the same work more cheaply, let them do it. While this is certainly a more profitable way of conducting a business, it has a negative side too.

Outsourcing is said to be for people who have little patience and a lot of money. Some may think that looking at the disadvantages of outsourcing amounts to thinking negatively; however, it is always better to consider the disadvantages in advance to avoid any future surprises (or should we say “shocks”?).

One of the most crucial facts of life is that no outside person or company can understand a product as well as its owner. In spite of their best efforts, contractors may not be able to help business owners reach their goals.

The potential for a communication gap is another drawback of outsourcing: the person the work is outsourced to may not be familiar with a particular culture, and that lack of understanding can show in the work performed.

The Anatomy of an Automated Search Engine

Monday, June 23rd, 2008

Creating a large-scale search engine is an onerous task that involves huge challenges.

A perfect automated search engine crawls the Web quickly and regularly to keep its collection of documents up to date. It also requires plenty of storage space to store the indices, and the documents themselves, efficiently.

The ever-growing Internet produces an enormous volume of data to handle, along with billions of queries every day. A search engine’s indexing system must process huge amounts of data while using space efficiently and handling thousands of queries per second. It should also give users the best possible navigation experience: finding almost anything on the Web with high precision while excluding junk results.
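To make the indexing side concrete, here is a minimal sketch of an inverted index, the core data structure behind fast query handling. The InvertedIndex class, the whitespace tokenization, and the AND-style search are illustrative assumptions for this sketch, not how any particular engine implements it.

```python
from collections import defaultdict

# Minimal inverted index: maps each term to the set of document IDs
# that contain it. Real engines add positions, compression, and ranking.
class InvertedIndex:
    def __init__(self):
        self.postings = defaultdict(set)

    def add_document(self, doc_id, text):
        # Naive whitespace tokenization; production indexers also
        # normalize markup, handle punctuation, and stem terms.
        for term in text.lower().split():
            self.postings[term].add(doc_id)

    def search(self, query):
        # AND semantics: return documents containing every query term.
        terms = query.lower().split()
        if not terms:
            return set()
        result = self.postings[terms[0]].copy()
        for term in terms[1:]:
            result &= self.postings[term]
        return result

index = InvertedIndex()
index.add_document(1, "web crawlers retrieve pages")
index.add_document(2, "crawlers index web pages quickly")
print(index.search("web crawlers"))  # {1, 2}
```

Because lookups touch only the postings for the query terms rather than scanning every document, this structure is what lets an engine answer thousands of queries per second.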

The anatomy of a search engine includes three major applications: one for crawling the Web, one for indexing, and one for searching.

Web Crawling

Search engines today depend on spiders or robots (special software) designed to continuously search the Web to find new pages.

Web crawling is the most important aspect of a search engine and also the most challenging. It involves interaction with thousands of Web servers and name servers, and it is performed by many fast, distributed crawlers. The crawlers continuously receive lists of URLs to crawl and store from a URL server. They start with the most heavily used servers and the most popular pages, and each keeps hundreds of connections open at a time so pages can be retrieved quickly. For each page, a crawler has to look up the host in DNS, connect to it, send a request, and receive a response. The crawler does not rank Web pages; it retrieves copies of all of them, compresses them, and stores them in a repository. They are later indexed and ranked according to different criteria. Everything is indexed: the visible text, images, alt tags, other non-HTML content, word processor documents, and more.
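The fetch-compress-store cycle just described can be sketched in a few lines of Python. This single-threaded example is only illustrative: the fetch_and_store function and the dictionary repository are assumptions made for the sketch, and the explicit DNS lookup is shown separately only to mirror the steps in the paragraph (urlopen resolves host names on its own).

```python
import socket
import urllib.request
import zlib
from urllib.parse import urlparse

def fetch_and_store(url, repository):
    """Fetch one page: resolve DNS, connect, send a request, receive a
    response, then store a compressed copy in the repository. A real
    crawler keeps hundreds of connections open across many machines."""
    host = urlparse(url).hostname
    socket.gethostbyname(host)  # explicit DNS lookup step, for illustration
    with urllib.request.urlopen(url, timeout=10) as response:
        body = response.read()
    # Store the compressed copy; indexing and ranking happen later.
    repository[url] = zlib.compress(body)

repository = {}
fetch_and_store("https://example.com/", repository)
print(len(repository["https://example.com/"]), "compressed bytes stored")
```

Compressing before storage is what makes it feasible to keep copies of billions of pages in the repository at all.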

Crawlers usually revisit the same Web pages repeatedly to check that a site is stable and that its pages are updated frequently. If a Web page is not responding at some point, the crawlers are usually programmed to come back later and try again. However, if a page is found to be down continuously or rarely updated, they stay away for longer periods of time or index it more slowly.
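One plausible way to implement that adaptive revisit policy is exponential backoff on the revisit interval: lengthen it when a page is down or unchanged, shorten it when the page has changed. The RevisitScheduler class and its interval bounds below are hypothetical choices made for this sketch.

```python
import time

class RevisitScheduler:
    """Tracks per-URL revisit intervals. Pages that fail or rarely
    change get their interval doubled; pages that changed since the
    last visit get it halved, down to a floor."""

    MIN_INTERVAL = 3600          # one hour (illustrative floor)
    MAX_INTERVAL = 30 * 86400    # thirty days (illustrative ceiling)

    def __init__(self):
        self.intervals = {}      # url -> seconds between visits
        self.next_visit = {}     # url -> earliest next crawl time

    def record_visit(self, url, fetched_ok, content_changed):
        interval = self.intervals.get(url, self.MIN_INTERVAL)
        if not fetched_ok or not content_changed:
            # Down or stale: back off and visit less often.
            interval = min(interval * 2, self.MAX_INTERVAL)
        else:
            # Page is alive and changing: visit more often.
            interval = max(interval // 2, self.MIN_INTERVAL)
        self.intervals[url] = interval
        self.next_visit[url] = time.time() + interval

    def is_due(self, url):
        # URLs never seen before are due immediately.
        return time.time() >= self.next_visit.get(url, 0)
```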

Crawlers also have the capability of following all the links found on Web pages, which they can then visit either right away or later, as the sketch below illustrates.
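Extracting those links is straightforward with Python's standard-library HTML parser. The LinkExtractor class below is a sketch under that assumption: it collects anchor hrefs and resolves relative URLs against the page's own address so they can be queued for later crawling.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

class LinkExtractor(HTMLParser):
    """Collects the href of every anchor tag, resolved against the
    page's own URL, so a crawler can queue them for later visits."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(urljoin(self.base_url, value))

extractor = LinkExtractor("https://example.com/index.html")
extractor.feed('<a href="/about">About</a> <a href="news.html">News</a>')
print(extractor.links)
# ['https://example.com/about', 'https://example.com/news.html']
```

Resolving relative links against the base URL matters here: without it, a link like "news.html" could not be queued as a fetchable address.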