Archive for June, 2008

The Negative Aspect Of Outsourcing!

Wednesday, June 25th, 2008

Outsourcing is subcontracting work to a third-party company, whether for product creation, design or some service. It has long been practiced by major businesses as well as smaller companies. As far as corporations go, this means business process outsourcing (BPO).

The Internet has long been bitten by the outsourcing bug: most online businesses outsource most of their product creation, and some have started outsourcing their marketing as well. Internet marketers outsource ebook creation, article writing, web development and design, software development, compiling detailed lists of affiliate programs, sorting information into lists, and any other job that is either time-consuming or requires expertise.

Outsourcing is being described by some as an unfortunate byproduct of a global economy. The thinking is simply that when the same work can be done by someone else cheaper, get it done. While this is certainly a more profitable way of conducting a business, it does have its negative side too.

Outsourcing is said to be for people who have little patience and a lot of money. Some may think that looking at the disadvantages of outsourcing amounts to thinking negatively; however, it is always better to consider all the disadvantages in advance to avoid any future surprises or, should we say, shocks.

One of the most crucial facts of life is that no third person or company can understand the product better than the owner. In spite of their best efforts, they may not be able to help business owners reach their goals.

A communication gap is said to be another of the demerits of outsourcing: the person to whom the work is outsourced may not be familiar with a particular culture, and that lack of understanding can show in the work performed. (more…)

The Anatomy Of An Automated Search Engine!

Monday, June 23rd, 2008

The creation of a large-scale search engine is an onerous task and one that entails huge challenges.

A perfect automated search engine in the current scenario is one that crawls the web quickly, gathers all the documents and keeps them up to date. Plenty of storage space is required to efficiently store the indices or the documents themselves.

The ever-growing internet generates an enormous volume of data to handle, including billions of queries daily. The indexing system of a search engine should be capable of processing huge amounts of data, using space as efficiently as possible, and handling thousands of queries per second. Users should get the best possible navigation experience: they should be able to find almost anything on the Web, with high-precision tools excluding the junk results, which are the main problem users face.

The anatomy of a search engine includes major applications such as crawling the web, indexing and searching.

Web Crawling

Search engines of today depend on spiders or robots: special software designed to continuously search the web to find new pages.

Web crawling is the most important aspect of a search engine and also the most challenging. It involves interaction with thousands of web servers and name servers, and it is performed by many fast distributed crawlers. A URL server continually supplies them with lists of URLs to crawl and store. The crawlers start their travel with the most used servers and highly popular pages. Each crawler keeps hundreds of connections open at one time in order to retrieve web pages quickly. For each page, the crawler has to look up the DNS, connect to the host, send a request and receive a response. It does not rank the web pages; it retrieves copies of all of them and stores them, compressed, in a repository, so that they can later be indexed and ranked based on different criteria. Everything from the visible text, images, alt tags, other non-HTML content, word processor documents and more is indexed.
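The flow described above (a URL server handing out URLs, a crawler fetching each page and storing a compressed copy in a repository) can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not how any real search engine is built: the names `UrlServer`, `crawl` and `fetch_over_http` are hypothetical, the repository is a plain dictionary, and the loop fetches one page at a time instead of keeping hundreds of connections open.

```python
import zlib
from collections import deque
from urllib.request import urlopen


class UrlServer:
    """Hypothetical URL server: hands out each URL to crawl at most once."""

    def __init__(self, seeds):
        self._queue = deque()
        self._seen = set()
        for url in seeds:
            self.add(url)

    def add(self, url):
        # Ignore URLs that were already queued or crawled.
        if url not in self._seen:
            self._seen.add(url)
            self._queue.append(url)

    def next_url(self):
        # Return the next URL, or None when there is nothing left to crawl.
        return self._queue.popleft() if self._queue else None


def fetch_over_http(url):
    """One possible fetcher: urlopen performs the DNS lookup, connects to the
    host, sends the request and reads the response body as bytes."""
    with urlopen(url, timeout=10) as response:
        return response.read()


def crawl(url_server, fetch, repository):
    """Fetch every URL from the server and store a compressed copy.

    No ranking happens here; indexing and ranking are separate, later steps.
    """
    while (url := url_server.next_url()) is not None:
        body = fetch(url)
        repository[url] = zlib.compress(body)
    return repository
```

Passing the fetcher in as an argument keeps the sketch easy to try without touching the network: `crawl(UrlServer(seeds), fetch_over_http, {})` would crawl real pages, while any function that maps a URL to bytes can stand in for it during experiments.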

Crawlers usually visit the same web pages repeatedly to ensure the site is a stable one and that the pages are being updated frequently. If a certain web page is not functioning at some point, the crawlers are usually programmed to go back later to try again. However, if it is found that the page is either down continuously or not being updated frequently, they stay away for longer periods of time or index it slowly.
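One simple way to picture this revisit behaviour is an interval that shrinks when a page keeps changing and grows when it is down or stale. The sketch below is an assumption made for illustration only: the `PageState` class, the `next_interval` function, the doubling and halving factors and the interval limits are all hypothetical, not taken from any particular crawler.

```python
from dataclasses import dataclass

# Hypothetical limits, in hours, for how often a page may be revisited.
MIN_INTERVAL = 1.0
MAX_INTERVAL = 24.0 * 30


@dataclass
class PageState:
    interval: float = 24.0  # hours until the next visit
    failures: int = 0       # consecutive failed visits


def next_interval(state, reachable, changed):
    """Update the revisit interval after one visit and return it."""
    if not reachable:
        # Page is down: come back later, waiting longer after each failure.
        state.failures += 1
        state.interval = min(state.interval * 2, MAX_INTERVAL)
    elif changed:
        # Page is up and was updated: visit it more often.
        state.failures = 0
        state.interval = max(state.interval / 2, MIN_INTERVAL)
    else:
        # Page is up but unchanged: slowly stretch the interval.
        state.failures = 0
        state.interval = min(state.interval * 1.5, MAX_INTERVAL)
    return state.interval
```

With these assumed numbers, a page that starts on a 24-hour schedule drops to 12 hours after one visit that finds new content, and climbs back to 24 hours if the next visit finds it unreachable.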

Crawlers can also follow all the links found on the web pages they visit, either as and when they find them or at a later time. (more…)
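As a rough illustration of link following, the sketch below pulls the links out of a page using only Python's standard library. The `LinkExtractor` class and `extract_links` function are hypothetical names chosen for this example, and the choices to drop fragments, skip non-HTTP links and ignore duplicates are assumptions; a crawler could feed the returned URLs to its queue right away or save them for a later visit.

```python
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin


class LinkExtractor(HTMLParser):
    """Collects absolute http(s) links from the <a href="..."> tags of a page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Resolve relative links against the page URL, drop "#fragment".
                absolute, _fragment = urldefrag(urljoin(self.base_url, value))
                is_web_link = absolute.startswith(("http://", "https://"))
                if is_web_link and absolute not in self.links:
                    self.links.append(absolute)


def extract_links(base_url, html):
    """Return the unique links found in html, in order of appearance."""
    parser = LinkExtractor(base_url)
    parser.feed(html)
    return parser.links
```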