The World Wide Web is commonly described as having three levels – the Surface web, the Deep web and the Dark web.
The Surface web is the part of the web that shows up in conventional search engine results. When searching through Google, Yahoo or Bing, for example, the results are web pages from the Surface web. Conventional search engines index web pages as they crawl the web by moving along hyperlinks. If there is no link to a web page, then the search engine web crawler does not find it.
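As a rough illustration of why an unlinked page stays out of the index, the sketch below runs a breadth-first crawl over a small in-memory link graph. The page names, the `LINKS` dictionary and the `crawl` function are hypothetical and exist only for this example; a real crawler would fetch and parse live pages rather than read a dictionary.

```python
from collections import deque

# Hypothetical link graph: each page maps to the pages it links to.
LINKS = {
    "home": ["about", "products"],
    "about": ["home"],
    "products": ["home", "contact"],
    "contact": [],
    "orphan": ["home"],  # links out, but no page links to it
}


def crawl(seed, links):
    """Return the pages reachable from `seed` by following hyperlinks,
    in the order a breadth-first crawler would discover them."""
    seen = {seed}
    order = [seed]
    queue = deque([seed])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in seen:
                seen.add(target)
                order.append(target)
                queue.append(target)
    return order


indexed = crawl("home", LINKS)
# Pages that exist but were never reached by following links.
unindexed = sorted(set(LINKS) - set(indexed))
print("indexed:", indexed)
print("not found by the crawler:", unindexed)
```

In this toy graph the `orphan` page exists and even links to `home`, but because nothing links to it, a crawl starting from `home` never reaches it.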
The Deep web is the part of the web which does not show up on conventional search engines. The Deep web stems from:
– Unlinked pages – pages that are not hyperlinked from another page
– Dynamic content created by a query, such as a site search query
– Private sites that require login or pages only accessible through a form
– Contextual web – pages that vary according to context, such as previously visited pages or the visitor's IP range
– Use of the Robots Exclusion Standard (robots.txt) to limit access by web crawlers
– Non-HTML content – textual content encoded in file formats not handled by search engines
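To make the Robots Exclusion Standard item above more concrete, here is a minimal sketch using `urllib.robotparser` from the Python standard library. The robots.txt content, the `ExampleBot` user agent and the example.com URLs are assumptions made up for this illustration. Note that the standard is advisory: it only keeps out crawlers that choose to honour it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: every crawler is asked to stay out of /private/.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""

parser = RobotFileParser()
# Parse the rules from memory instead of fetching them over the network.
parser.parse(ROBOTS_TXT.splitlines())


def allowed(url, agent="ExampleBot"):
    """Return True if the parsed rules permit `agent` to fetch `url`."""
    return parser.can_fetch(agent, url)


print(allowed("https://example.com/index.html"))
print(allowed("https://example.com/private/report.html"))
```

A well-behaved crawler would skip the second URL, so that page would not appear in the search engine's index even though it is publicly reachable.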
The Dark web is the portion of the web that can only be reached through anonymising networks such as Tor, I2P or Freenet. There are many legitimate reasons for anonymous Tor network usage, such as communications by foreign offices, intelligence agents, law enforcement officers, and residents of countries controlled by oppressive regimes. The Tor network is also used by criminals to ensure their anonymity. Grams is a new search engine for the dark web.
The Silk Road was an underground marketplace accessible through the Tor network. Markets of this kind are used for anonymising criminal activities in areas such as narcotics, firearms, child pornography, stolen goods and stolen data. Bitcoin is the currency of choice for most illegal dark web transactions. The dark web is also the market for Cybercrime-as-a-Service (CaaS) – the outsourcing of different elements of cyber crime such as malware creation and distribution, botnets, mules and money laundering. Even though the Silk Road was shut down in October 2013 with the arrest of Ross Ulbricht, other Tor markets such as Silk Road 2.0 have since sprung up. The criminal marketplace may well disperse across several dark web marketplaces.
Even though methodical research on the size of the deep web has not been undertaken for over a decade, it is thought that the deep web is many times larger than the surface web – estimates range from tens of times larger to hundreds of times larger. This is normally depicted graphically as an iceberg, with the tip above the waterline representing the surface web and the underwater portion representing the deep web. Although many may think of the deep web as largely comprising criminal activity, this is not the case – most of the information inaccessible to conventional search engines comprises legitimate government, academic and enterprise data.
At SentryBay we are developing our own methods of searching the deep web for sensitive customer data. Our approach involves a web crawler identifying sites of interest – after pointing it in the right direction, the crawler scurries off into the deep corners and crevices of the web looking for sites satisfying specific criteria. When poking around the deep web, it is important to steer clear of dark web material, as it is illegal in most countries to download (and be in possession of) certain types of data.
Much of the surface web comprises companies harvesting vast amounts of data from users for advertising purposes. This, combined with mass governmental surveillance, will drive increasing numbers of users toward the anonymous portion of the web – the deep and dark web. An increase in activity in the deep web has security implications as well.