Host- and Domain-Level Web Graphs Nov/Dec/Jan 2017-2018 – Common Crawl
Common Information
Type | Value
UUID | f07ab2fd-6df0-40df-a858-077b030a1257
Fingerprint | bf5faf5aac1fb980
Analysis status | DONE
Considered CTI value | 0
Text language |
Published | Feb. 8, 2018, midnight
Added to db | Jan. 18, 2023, 7:45 p.m.
Last updated | Nov. 17, 2024, 6:54 p.m.
Headline | Common Crawl
Title | Host- and Domain-Level Web Graphs Nov/Dec/Jan 2017-2018 – Common Crawl
Detected Hints/Tags/Attributes | 56/1/199
Attributes
Type | #Events | CTI | Value
Domain | 206 | | www.example.com
Domain | 6 | | www.subdomain.example.com
Domain | 7 | | data.commoncrawl.org
Domain | 1 | | cc-main-2017-18-nov-dec-jan-host.properties
Domain | 1 | | cc-main-2017-18-nov-dec-jan-host-t.properties
Domain | 8 | | publicsuffix.org
Domain | 3 | | foo.blogspot.com
Domain | 22 | | blogspot.com
Domain | 77 | | amazonaws.com
Domain | 1 | | cc-main-2017-18-nov-dec-jan-domain.properties
Domain | 1 | | cc-main-2017-18-nov-dec-jan-domain-t.properties
Domain | 42 | | com.google
Domain | 7 | | com.youtube
Domain | 359 | | com.apple
Domain | 14 | | com.amazon
Domain | 27 | | com.microsoft
Domain | 6 | | gl.goo
Domain | 6 | | com.flickr
Domain | 8 | | com.yahoo
Domain | 6 | | com.bing
Domain | 6 | | com.imdb
Domain | 188 | | com.android
Domain | 8 | | com.oracle
Domain | 6 | | uk.co.google
Domain | 6 | | uk.co.bbc
Domain | 5 | | com.nike
Domain | 27 | | au.com
Domain | 5 | | com.live
Domain | 6 | | com.sap
Domain | 5 | | com.chrome
Domain | 6 | | com.godaddy
Domain | 5 | | au.gov
Domain | 6 | | edu.mit
Domain | 6 | | com.aol
Domain | 4 | | org.aarp
Domain | 7 | | com.gmail
Domain | 8 | | ru.yandex
Domain | 11 | | com.ibm
Domain | 5 | | gov.ca
Domain | 24 | | uk.co
Domain | 1 | | com.zappos
Domain | 6 | | com.bloomberg
Domain | 2 | | com.audible
Domain | 1 | | tl.page
Domain | 5 | | fr.free
Domain | 6 | | com.bbc
Domain | 6 | | de.google
Domain | 6 | | uk.co.amazon
Domain | 10 | | com.cisco
Domain | 6 | | com.weibo
Domain | 26 | | com.skype
Domain | 6 | | com.box
Domain | 9 | | com.samsung
Domain | 5 | | uk.ac.cam
Domain | 6 | | com.hotmail
Domain | 5 | | com.uk
Domain | 6 | | ca.google
Domain | 8 | | uk.ac
Domain | 5 | | com.office
Domain | 6 | | de.amazon
Domain | 10 | | com.booking
Domain | 5 | | com.weather
Domain | 1 | | jobs.amazon
Domain | 7 | | com.java
Domain | 6 | | com.inc
Domain | 6 | | es.google
Domain | 6 | | com.dell
Domain | 6 | | org.ieee
Domain | 4 | | com.americanexpress
Domain | 6 | | fr.google
Domain | 11 | | com.netflix
Domain | 6 | | it.google
Domain | 5 | | br.com.uol
Domain | 6 | | au.net.abc
Domain | 4 | | com.mlb
Domain | 5 | | jp.co.yahoo
Domain | 6 | | com.target
Domain | 11 | | com.baidu
Domain | 5 | | cn.com.sina
Domain | 5 | | es.com
Domain | 6 | | com.nba
Domain | 4 | | gov.house
Domain | 6 | | jp.co.google
Domain | 6 | | com.nfl
Domain | 6 | | com.globo
Domain | 5 | | com.nokia
Domain | 11 | | br.com
Domain | 6 | | jp.ne
Domain | 6 | | au.com.google
Domain | 4 | | com.blog
Domain | 6 | | com.playstation
Domain | 5 | | edu.si
Domain | 3 | | cc.co
Domain | 6 | | com.bestbuy
Domain | 6 | | com.boston
Domain | 6 | | nl.google
Domain | 6 | | jp.co.amazon
Domain | 5 | | gov.nyc
Domain | 2 | | com.hyatt
Domain | 8 | | com.com
Domain | 4 | | jp.ac
Domain | 5 | | ca.amazon
Domain | 6 | | com.walmart
Domain | 6 | | in.co.google
Domain | 2 | | com.ups
Domain | 4 | | gov.dot
Domain | 5 | | com.me
Domain | 7 | | gd.is
Domain | 4 | | com.xbox
Domain | 5 | | au.com.news
Domain | 5 | | com.intuit
Domain | 6 | | fr.amazon
Domain | 5 | | com.us
Domain | 4 | | com.deloitte
Domain | 4 | | com.space
Domain | 3 | | net.yahoo
Domain | 4 | | com.windows
Domain | 17 | | com.alibaba
Domain | 6 | | br.com.google
Domain | 4 | | mil.navy
Domain | 2 | | mil.army
Domain | 5 | | com.today
Domain | 3 | | com.fedex
Domain | 2 | | org.eu
Domain | 1 | | gov.energy
Domain | 3 | | jp.ne.sakura
Domain | 4 | | com.sky
Domain | 2 | | nr.co
Domain | 6 | | com.salon
Domain | 3 | | google.blog
Domain | 3 | | com.sony
Domain | 4 | | com.accenture
Domain | 2 | | gov.weather
Domain | 3 | | ch.cern
Domain | 1 | | com.lego
Domain | 3 | | ru.google
Domain | 3 | | au.edu
Domain | 1 | | ly.cl
Domain | 2 | | se.google
Domain | 2 | | be.google
Domain | 3 | | com.monster
Domain | 3 | | net.java
Domain | 3 | | com.taobao
Domain | 4 | | es.amazon
Domain | 3 | | jp.ne.goo
Domain | 2 | | hk.com.google
Domain | 1 | | ru.org
Domain | 2 | | jp.or.nhk
Domain | 1 | | cc-main-2018-jan-host.properties
Domain | 1 | | cc-main-2018-jan-host-t.properties
Domain | 1 | | cc-main-2018-jan-domain.properties
Domain | 1 | | cc-main-2018-jan-domain-t.properties
File | 816 | | index.html
File | 4 | | paths.gz
File | 1 | | cc-main-2017-18-nov-dec-jan-host-ranks.txt
File | 1 | | cc-main-2017-18-nov-dec-jan-domain-vertices.txt
File | 1 | | cc-main-2017-18-nov-dec-jan-domain-edges.txt
File | 1 | | cc-main-2017-18-nov-dec-jan-domain-ranks.txt
File | 7 | | com.tum
File | 33 | | com.bin
File | 8 | | org.py
File | 7 | | net.php
File | 6 | | edu.ps
File | 8 | | com.geo
File | 6 | | org.iso
File | 4 | | it.pl
File | 12 | | com.java
File | 2 | | org.sql
File | 20 | | com.ai
File | 7 | | com.inc
File | 6 | | com.js
File | 3 | | it.bin
File | 25 | | com.tar
File | 5 | | de.spi
File | 2 | | ly.pl
File | 16 | | com.ps
File | 54 | | com.pl
File | 7 | | com.pas
File | 7 | | com.webm
File | 4 | | gov.dot
File | 7 | | net.js
File | 14 | | org.pl
File | 26 | | com.cs
File | 4 | | net.bat
File | 20 | | com.doc
File | 16 | | com.html
File | 2 | | org.doc
File | 26 | | com.inf
File | 3 | | ch.cer
File | 3 | | tt.db
File | 3 | | net.java
File | 1 | | cc-main-2018-jan-host-vertices.txt
File | 1 | | cc-main-2018-jan-host-edges.txt
File | 1 | | cc-main-2018-jan-host-ranks.txt
File | 1 | | cc-main-2018-jan-domain-vertices.txt
File | 1 | | cc-main-2018-jan-domain-edges.txt
File | 1 | | cc-main-2018-jan-domain-ranks.txt
Url | 1 | | https://data.commoncrawl.org/projects/hyperlinkgraph/cc-main-2017-18-nov-dec-jan/host
Url | 1 | | https://data.commoncrawl.org/projects/hyperlinkgraph/cc-main-2017-18-nov-dec-jan/domain