Peter's Blog

Redefining the Impossible

Scraping WoW Armory in Ruby


I figured out how to scrape the WoW Armory for up-to-date information about my characters using Ruby. This means I can do this directly in PetersBlogger without using third-party signature generators that disappear overnight or get absorbed into WoW gold sites.

    require 'net/http'
    require 'rexml/document'

    #
    # Scrape WoW character info.
    #
    def WoWInfo
      begin
        strRealm = 'Eonar'

        oInfo = []

        ['Pookypoo', 'Maevyn', 'Maezyn', 'Maexyn'].each do |strCharacter|
          oCharInfo = []

          #
          # Open url.
          # Look the data up in the European armory. Change this for US.
          # Need to specify Firefox as user agent as this makes the server
          # return an XML file. If this is not done we get HTML.
          oResp = Net::HTTP.start( "armory.wow-europe.com", 80) do |http|
                http.get( "/character-sheet.xml?r=#{strRealm.tr( ' ', '+')}&n=#{strCharacter}",
                            { 'user-agent' => 'Mozilla/5.0 (Windows; U; Windows NT 5.0; en-GB; rv:1.8.1.4) Gecko/20070515 Firefox/2.0.0.4'})
          end

          # Basic details: race, class and level.
          oDoc = REXML::Document.new oResp.body
          oDoc.elements[1].elements.each( 'characterInfo/character') do |oElement|
            oCharInfo << {:race => oElement.attributes['race']}
            oCharInfo << {:class => oElement.attributes['class']}
            oCharInfo << {:level => oElement.attributes['level']}
          end

          # Talent spec as points per tree, e.g. "x/y/z".
          oDoc.elements[1].elements.each( 'characterInfo/characterTab/talentSpec') do |oElement|
            oCharInfo << {:spec => "#{oElement.attributes['treeOne']}/#{oElement.attributes['treeTwo']}/#{oElement.attributes['treeThree']}"}
          end

          # One entry per profession, keyed by the profession name.
          oDoc.elements[1].elements.each( 'characterInfo/characterTab/professions/skill') do |oElement|
            oCharInfo << {oElement.attributes['key'].to_sym => oElement.attributes['value']}
          end

          oInfo << {strCharacter => oCharInfo}
        end

        return oInfo
      rescue
        # Any StandardError (network, parsing) gives an empty list.
        return []
      end
    end

This lot generates the info on this page, which is linked under 'My Characters' over to the right. It should be fairly up to date, as the cache is flushed once a day, which causes the armory to be requeried. Currently the output is presented in a Rails view thus:

    <% cache( :part => :wow) do %>
      <table>
      <% WoWInfo().each do |oCharInfo| %>
        <% oCharInfo.each_pair do |strChar, oInfo| %>
          <tr><td colspan="2"><b><%= strChar %></b></td></tr>
          <% oInfo.each do |oItem| %>
            <% oItem.each_pair do |strItem, strValue| %>
              <tr>
                <td><%= strItem %>:</td><td><%= strValue %></td>
              </tr>
            <% end %>
          <% end %>
        <% end %>
      <% end %>
      </table>
    <% end %>

All the work is done by the WoWInfo method, which I put in the application helper. I didn't use a controller method for this.
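For anyone who wants to see the shape of the data without hitting the armory, here is a rough sketch that runs the same REXML queries against a hand-written XML sample. The element and attribute names come from the queries in WoWInfo; the root element name, the sample values and the parse_character_sheet helper are assumptions made up for illustration, not something the armory is guaranteed to return.

```ruby
require 'rexml/document'

# Hand-written sample. Element and attribute names follow the queries in
# WoWInfo; the root element name and all values are invented.
SAMPLE_XML = <<~XML
  <page>
    <characterInfo>
      <character name="Pookypoo" race="Dwarf" class="Paladin" level="70"/>
      <characterTab>
        <talentSpec treeOne="40" treeTwo="0" treeThree="21"/>
        <professions>
          <skill key="mining" value="375"/>
          <skill key="engineering" value="350"/>
        </professions>
      </characterTab>
    </characterInfo>
  </page>
XML

# Hypothetical helper: the same queries as WoWInfo, but on an XML string
# instead of an HTTP response.
def parse_character_sheet(xml)
  root = REXML::Document.new(xml).root
  info = []

  root.elements.each('characterInfo/character') do |el|
    info << { :race  => el.attributes['race'] }
    info << { :class => el.attributes['class'] }
    info << { :level => el.attributes['level'] }
  end

  root.elements.each('characterInfo/characterTab/talentSpec') do |el|
    spec = %w[treeOne treeTwo treeThree].map { |tree| el.attributes[tree] }.join('/')
    info << { :spec => spec }
  end

  root.elements.each('characterInfo/characterTab/professions/skill') do |el|
    info << { el.attributes['key'].to_sym => el.attributes['value'] }
  end

  info
end

char_info = parse_character_sheet(SAMPLE_XML)

# The list of one-key hashes keeps the display order for the view; merging
# them gives a single hash that is easier to look things up in.
merged = char_info.inject { |acc, item| acc.merge(item) }
puts merged[:spec]
```

Each character ends up as an ordered list of one-key hashes, which is why the view needs the nested each / each_pair loops. All values are strings, exactly as they appear in the XML attributes.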

I'm going to think about a nicer presentation with some pictures of the actual guys rather than generic pictures of bald redhead dwarves.

UPDATE: I've added my old Aerie Peak toons to the output but the code above still stands. Lugulas was my old auction alt who I've never named here before, hence level 10 with level 75 (dis)enchanting. I find myself missing them...


Spikeles Says:

Just a note too, a lot of sites have been having issues with processing data from the armoury. Apparently Blizzard is blocking IP ranges that cause excessive requests. Some more info here about someone who had it done to them. ://be.imba.hu/forums/viewtopic.php?id=114

Caching would probably alleviate most of this: only request new data every so often.

Peter Says:

I've set this up to use Rails' fragment caching (this blog uses that for all pages and things like the 'recent blog posts'). I can flush the cache daily, so I should only hit the armory itself once a day, and only then if someone actually looks at that page.

UPDATE: Since spikeles posted this I have added the code for the view which quite clearly shows the use of fragment caching.
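Outside Rails, the same once-a-day idea can be sketched in plain Ruby. This is not how fragment caching works internally; TimedCache is a hypothetical class made up for illustration, and the string in the blocks stands in for a real call such as WoWInfo().

```ruby
# Hypothetical time-based cache: holds one value and only runs the block
# again once ttl_seconds have passed since the last fetch.
class TimedCache
  # clock is anything that responds to #now; it is injectable so the
  # expiry can be exercised without waiting.
  def initialize(ttl_seconds, clock = Time)
    @ttl = ttl_seconds
    @clock = clock
    @value = nil
    @fetched_at = nil
  end

  def fetch
    now = @clock.now
    if @fetched_at.nil? || now - @fetched_at >= @ttl
      @value = yield        # the expensive call, e.g. the armory request
      @fetched_at = now
    end
    @value
  end
end

ONE_DAY = 24 * 60 * 60
cache = TimedCache.new(ONE_DAY)

calls = 0
first  = cache.fetch { calls += 1; 'armory data' }
second = cache.fetch { calls += 1; 'armory data' }

# The block only ran for the first fetch; the second was served from memory.
puts calls
```

Like the fragment cache, this is lazy: nothing is requested until someone actually asks for the data, and then at most once per day. Unlike the fragment cache, it lives in memory, so it is lost whenever the process restarts.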

Sam Says:

I don't know if you've seen this, it might help you. http://wowr.rubyforge.org/

Still very Beta, but seems to work.

Peter Says:

Looks good, and a gem too. My script is still working nicely for me; if it breaks through armory changes or I want more, then I'll investigate this further.

Thanks for the link.
