Host- and Domain-Level Web Graphs Nov/Dec/Jan 2017-2018 – Common Crawl
Tags
attack-pattern: | Data, Domains - T1583.001, Domains - T1584.001, Python - T1059.006, Software - T1592.002 |
Common Information
Type | Value |
---|---|
UUID | f07ab2fd-6df0-40df-a858-077b030a1257 |
Fingerprint | bf5faf5aac1fb980 |
Analysis status | DONE |
Considered CTI value | 0 |
Text language | |
Published | Feb. 8, 2018, midnight |
Added to db | Jan. 18, 2023, 7:45 p.m. |
Last updated | Nov. 17, 2024, 6:54 p.m. |
Headline | Common Crawl |
Title | Host- and Domain-Level Web Graphs Nov/Dec/Jan 2017-2018 – Common Crawl |
Detected Hints/Tags/Attributes | 56/1/199 |
Source URLs
Redirection | Url |
---|---|
Source | http://commoncrawl.org/2018/02/webgraphs-nov-dec-2017-jan-2018/ |
URL Provider
Attributes
Type | #Events | CTI | Value |
---|---|---|---|
Domain | 206 | | www.example.com |
Domain | 6 | | www.subdomain.example.com |
Domain | 7 | | data.commoncrawl.org |
Domain | 1 | | cc-main-2017-18-nov-dec-jan-host.properties |
Domain | 1 | | cc-main-2017-18-nov-dec-jan-host-t.properties |
Domain | 8 | | publicsuffix.org |
Domain | 3 | | foo.blogspot.com |
Domain | 22 | | blogspot.com |
Domain | 77 | | amazonaws.com |
Domain | 1 | | cc-main-2017-18-nov-dec-jan-domain.properties |
Domain | 1 | | cc-main-2017-18-nov-dec-jan-domain-t.properties |
Domain | 42 | | com.google |
Domain | 7 | | com.youtube |
Domain | 359 | | com.apple |
Domain | 14 | | com.amazon |
Domain | 27 | | com.microsoft |
Domain | 6 | | gl.goo |
Domain | 6 | | com.flickr |
Domain | 8 | | com.yahoo |
Domain | 6 | | com.bing |
Domain | 6 | | com.imdb |
Domain | 188 | | com.android |
Domain | 8 | | com.oracle |
Domain | 6 | | uk.co.google |
Domain | 6 | | uk.co.bbc |
Domain | 5 | | com.nike |
Domain | 27 | | au.com |
Domain | 5 | | com.live |
Domain | 6 | | com.sap |
Domain | 5 | | com.chrome |
Domain | 6 | | com.godaddy |
Domain | 5 | | au.gov |
Domain | 6 | | edu.mit |
Domain | 6 | | com.aol |
Domain | 4 | | org.aarp |
Domain | 7 | | com.gmail |
Domain | 8 | | ru.yandex |
Domain | 11 | | com.ibm |
Domain | 5 | | gov.ca |
Domain | 24 | | uk.co |
Domain | 1 | | com.zappos |
Domain | 6 | | com.bloomberg |
Domain | 2 | | com.audible |
Domain | 1 | | tl.page |
Domain | 5 | | fr.free |
Domain | 6 | | com.bbc |
Domain | 6 | | de.google |
Domain | 6 | | uk.co.amazon |
Domain | 10 | | com.cisco |
Domain | 6 | | com.weibo |
Domain | 26 | | com.skype |
Domain | 6 | | com.box |
Domain | 9 | | com.samsung |
Domain | 5 | | uk.ac.cam |
Domain | 6 | | com.hotmail |
Domain | 5 | | com.uk |
Domain | 6 | | ca.google |
Domain | 8 | | uk.ac |
Domain | 5 | | com.office |
Domain | 6 | | de.amazon |
Domain | 10 | | com.booking |
Domain | 5 | | com.weather |
Domain | 1 | | jobs.amazon |
Domain | 7 | | com.java |
Domain | 6 | | com.inc |
Domain | 6 | | es.google |
Domain | 6 | | com.dell |
Domain | 6 | | org.ieee |
Domain | 4 | | com.americanexpress |
Domain | 6 | | fr.google |
Domain | 11 | | com.netflix |
Domain | 6 | | it.google |
Domain | 5 | | br.com.uol |
Domain | 6 | | au.net.abc |
Domain | 4 | | com.mlb |
Domain | 5 | | jp.co.yahoo |
Domain | 6 | | com.target |
Domain | 11 | | com.baidu |
Domain | 5 | | cn.com.sina |
Domain | 5 | | es.com |
Domain | 6 | | com.nba |
Domain | 4 | | gov.house |
Domain | 6 | | jp.co.google |
Domain | 6 | | com.nfl |
Domain | 6 | | com.globo |
Domain | 5 | | com.nokia |
Domain | 11 | | br.com |
Domain | 6 | | jp.ne |
Domain | 6 | | au.com.google |
Domain | 4 | | com.blog |
Domain | 6 | | com.playstation |
Domain | 5 | | edu.si |
Domain | 3 | | cc.co |
Domain | 6 | | com.bestbuy |
Domain | 6 | | com.boston |
Domain | 6 | | nl.google |
Domain | 6 | | jp.co.amazon |
Domain | 5 | | gov.nyc |
Domain | 2 | | com.hyatt |
Domain | 8 | | com.com |
Domain | 4 | | jp.ac |
Domain | 5 | | ca.amazon |
Domain | 6 | | com.walmart |
Domain | 6 | | in.co.google |
Domain | 2 | | com.ups |
Domain | 4 | | gov.dot |
Domain | 5 | | com.me |
Domain | 7 | | gd.is |
Domain | 4 | | com.xbox |
Domain | 5 | | au.com.news |
Domain | 5 | | com.intuit |
Domain | 6 | | fr.amazon |
Domain | 5 | | com.us |
Domain | 4 | | com.deloitte |
Domain | 4 | | com.space |
Domain | 3 | | net.yahoo |
Domain | 4 | | com.windows |
Domain | 17 | | com.alibaba |
Domain | 6 | | br.com.google |
Domain | 4 | | mil.navy |
Domain | 2 | | mil.army |
Domain | 5 | | com.today |
Domain | 3 | | com.fedex |
Domain | 2 | | org.eu |
Domain | 1 | | gov.energy |
Domain | 3 | | jp.ne.sakura |
Domain | 4 | | com.sky |
Domain | 2 | | nr.co |
Domain | 6 | | com.salon |
Domain | 3 | | google.blog |
Domain | 3 | | com.sony |
Domain | 4 | | com.accenture |
Domain | 2 | | gov.weather |
Domain | 3 | | ch.cern |
Domain | 1 | | com.lego |
Domain | 3 | | ru.google |
Domain | 3 | | au.edu |
Domain | 1 | | ly.cl |
Domain | 2 | | se.google |
Domain | 2 | | be.google |
Domain | 3 | | com.monster |
Domain | 3 | | net.java |
Domain | 3 | | com.taobao |
Domain | 4 | | es.amazon |
Domain | 3 | | jp.ne.goo |
Domain | 2 | | hk.com.google |
Domain | 1 | | ru.org |
Domain | 2 | | jp.or.nhk |
Domain | 1 | | cc-main-2018-jan-host.properties |
Domain | 1 | | cc-main-2018-jan-host-t.properties |
Domain | 1 | | cc-main-2018-jan-domain.properties |
Domain | 1 | | cc-main-2018-jan-domain-t.properties |
File | 816 | | index.html |
File | 4 | | paths.gz |
File | 1 | | cc-main-2017-18-nov-dec-jan-host-ranks.txt |
File | 1 | | cc-main-2017-18-nov-dec-jan-domain-vertices.txt |
File | 1 | | cc-main-2017-18-nov-dec-jan-domain-edges.txt |
File | 1 | | cc-main-2017-18-nov-dec-jan-domain-ranks.txt |
File | 7 | | com.tum |
File | 33 | | com.bin |
File | 8 | | org.py |
File | 7 | | net.php |
File | 6 | | edu.ps |
File | 8 | | com.geo |
File | 6 | | org.iso |
File | 4 | | it.pl |
File | 12 | | com.java |
File | 2 | | org.sql |
File | 20 | | com.ai |
File | 7 | | com.inc |
File | 6 | | com.js |
File | 3 | | it.bin |
File | 25 | | com.tar |
File | 5 | | de.spi |
File | 2 | | ly.pl |
File | 16 | | com.ps |
File | 54 | | com.pl |
File | 7 | | com.pas |
File | 7 | | com.webm |
File | 4 | | gov.dot |
File | 7 | | net.js |
File | 14 | | org.pl |
File | 26 | | com.cs |
File | 4 | | net.bat |
File | 20 | | com.doc |
File | 16 | | com.html |
File | 2 | | org.doc |
File | 26 | | com.inf |
File | 3 | | ch.cer |
File | 3 | | tt.db |
File | 3 | | net.java |
File | 1 | | cc-main-2018-jan-host-vertices.txt |
File | 1 | | cc-main-2018-jan-host-edges.txt |
File | 1 | | cc-main-2018-jan-host-ranks.txt |
File | 1 | | cc-main-2018-jan-domain-vertices.txt |
File | 1 | | cc-main-2018-jan-domain-edges.txt |
File | 1 | | cc-main-2018-jan-domain-ranks.txt |
Url | 1 | | https://data.commoncrawl.org/projects/hyperlinkgraph/cc-main-2017-18-nov-dec-jan/host |
Url | 1 | | https://data.commoncrawl.org/projects/hyperlinkgraph/cc-main-2017-18-nov-dec-jan/domain |