Wednesday, 7 November 2012

Script to scan or monitor Pastebin

I thought I would share a script I came up with to scan Pastebin for any information that may be of interest to you or your organisation. The script is written in Ruby. Similar work has already been done by others (and a lot better than mine):

http://blog.rootshell.be/2012/01/17/monitoring-pastebin-com-within-your-siem/
http://www.shellguardians.com/
https://isc.sans.edu/diary.html?storyid=12091

Basically, I have used the ideas mentioned in each of the links above to come up with something that worked for me. A lot in my script also depends on how Google returns search results at the moment, so it is bound to break some day, but since I wrote it I hope I will be able to fix it.

The idea is to use Google to search for strings you are interested in on the pastebin website. The output is written to a file which is used to keep track of all the search results. This file can then be included in your Security Dashboard.
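
As a rough illustration of the search step, here is a minimal sketch of how the query URL could be built. The helper name build_search_url and the stripped-down query string (only the q parameter) are my own assumptions for illustration, and URI.encode_www_form_component needs a newer Ruby than the 1.8.7 the script was checked on.

```ruby
require 'uri'

# Hypothetical helper (not part of the original script): builds a Google
# search URL restricted to one site for the given keywords.
def build_search_url(keywords, site = 'pastebin.com')
  query = "#{keywords} site:#{site}"
  # encode_www_form_component turns spaces into '+' and percent-encodes
  # characters such as ':' so they cannot break the query string.
  "http://www.google.com.au/search?q=#{URI.encode_www_form_component(query)}"
end

puts build_search_url('example.org password')
```

Keeping the encoding inside one helper means any keyword list can be passed in without worrying about spaces or + signs.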

The script is separated into two files. One is to actually run a Google search and save the results in a temp file. The other compares the temp file with the stored results file and appends only the new entries to the results file. (Please check the other update pastebin script post)

File 1 - Google_adv.rb


# Created November 2012 - requires file_cmp.rb to get unique results in file results.txt

# I am using open-uri to do the web requests
require 'open-uri'
# file_cmp.rb is the second ruby script which does the file comparison
# (on Ruby 1.9 and later, use require_relative 'file_cmp' instead)
require 'file_cmp.rb'

# Open a temp file to write your results

f_tmp = open 'res.tmp', 'w+'

# Replace "keywords of interest" with terms you want to look for on pastebin

search_string = "keywords of interest site:pastebin.com"

# URL encode the search string so that spaces or + signs in the search
# string don't break the search query
enc_str = URI::encode(search_string)

#just an array to hold regex matches. Used later

i = Array.new()

#Run the google query and pass it to the block to do some regex matching

open("http://www.google.com.au/search?sclient=psy-ab&hl=en&site=&source=hp&q=#{enc_str}&btnG=Search") { |url|

# check each line using regular expression

url.each_line { |line|

# The regex picks out the pastebin urls that Google found interesting text in

    re=/()(pastebin\.com\/\w+)(<\/cite>)/

# I used scan function so that I could get all matches

i = line.scan(re)

# if there are any matches write them to the temp file

if i.size != 0
  i.each { |match|
    match.to_s.scan(/pastebin\.com\/\w+/) { |got_it|
      f_tmp.print got_it + "\n"
    }
  }
end
}
}
# close the file and call the file comparison script
f_tmp.close
fileCompare()
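
The matching step can be tried on its own without calling Google. This is only a sketch: the helper name extract_pastebin_urls and the sample HTML are made up for illustration, and the markup Google actually returns may look different.

```ruby
# Hypothetical helper: returns the unique pastebin paste URLs that are
# directly followed by a closing </cite> tag in an HTML string.
def extract_pastebin_urls(html)
  # scan with one capture group returns an array of one-element arrays,
  # so flatten it before removing duplicates.
  html.scan(/(pastebin\.com\/\w+)<\/cite>/).flatten.uniq
end

# Made-up sample markup, only for demonstration.
sample = '<li><cite>pastebin.com/AbC123</cite></li>' \
         '<li><cite>pastebin.com/xyz789</cite></li>' \
         '<li><cite>pastebin.com/AbC123</cite></li>'

puts extract_pastebin_urls(sample)
```

Scanning the whole response in one go and calling uniq keeps repeated hits out of the temp file.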

File 2 - file_cmp.rb

# Created November 2012 - required in Google_adv.rb
# Function that is called by Google_adv.rb 
def fileCompare 
# Open files to read the contents
f_res = open 'results.txt', 'r+'
f_tmp = open 'res.tmp', 'r+'

arr_tmp = Array.new()
arr_res = Array.new()

# Contents of the files are stored in respective arrays
f_tmp.each_line {|tmp| arr_tmp << tmp}
f_res.each_line {|res| arr_res << res}

f_res.close
f_tmp.close
# Open results file in append mode to append any new pastebin entries
f_res = open 'results.txt', 'a+'

# Compare the arrays to remove duplicate entries
arr_res.each { |uri| 
arr_tmp.delete_if { |tmp_uri| tmp_uri == uri}
}

# Append the results and close the file
arr_tmp.each { |i| f_res.puts i }
f_res.close
end
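
The same comparison can also be written with Ruby's array difference. The following is a minimal sketch, not a drop-in replacement: the helper name append_new_entries is hypothetical, it treats a missing results file as empty, and it relies on File.write and Dir.mktmpdir, which need a newer Ruby than 1.8.7.

```ruby
require 'tmpdir'

# Hypothetical helper: appends lines from tmp_path that are not yet in
# results_path and returns the newly added entries.
def append_new_entries(tmp_path, results_path)
  found = File.readlines(tmp_path).map { |l| l.strip }.reject { |l| l.empty? }.uniq
  # A missing results file is treated as "nothing seen so far".
  known = File.exist?(results_path) ? File.readlines(results_path).map { |l| l.strip } : []
  fresh = found - known
  File.open(results_path, 'a') { |f| fresh.each { |entry| f.puts entry } }
  fresh
end

# Demonstration in a throwaway directory with made-up entries.
Dir.mktmpdir do |dir|
  tmp = File.join(dir, 'res.tmp')
  res = File.join(dir, 'results.txt')
  File.write(tmp, "pastebin.com/aaa\npastebin.com/bbb\n")
  File.write(res, "pastebin.com/aaa\n")
  p append_new_entries(tmp, res) # only the entry not already in results.txt
end
```

Returning the new entries makes it easy to log or alert on them as well as store them.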


Place both the Ruby scripts in the same folder. The results are stored in the same folder in a file named "results.txt". Note that file_cmp.rb opens results.txt in 'r+' mode, so the file must already exist (an empty file will do) before the first run. The script needs a lot of improvement in terms of error handling, and it could also be extended to parse results from sites other than Pastebin (mentioned in the ISC link above). This works for me right now.

(Checked on Ruby 1.8.7)

Disclaimer - The scripts I write are just a way for me to learn new ways of using Ruby. You can use the script but I am not responsible in any way if Google or Pastebin block your IP address if the script is run in a way that violates their T&C. It may also have bugs. All scripts supplied on this site are provided as-is and no liability is accepted for any accidental harm this script may cause.
