[Oisf-users] URL reputation?
Rich Rumble
richrumble at gmail.com
Wed Jan 30 21:03:41 UTC 2013
On Wed, Jan 30, 2013 at 10:33 AM, Matt Jonkman <jonkman at jonkmans.com> wrote:
> Ya, I've had that on my mind for a while, but I think the scale issues are
> core.
>
> We have IP rep now, and shortly DNS rep that can be applied. I think a good
> number of URLs can be knocked down with good domain rep.
>
> But I also think this is worth exploring. I wonder if there are any
> algorithms out there that could take a list of 200k URLs and boil them down
> to a set of core prequalifying strings minus the domains?
>
> Or masking parameter values that vary in some automated way to get the least
> number of matches required?
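A minimal sketch of the masking idea quoted above, assuming Python and only
the standard library. The function names (url_signature, signature_counts)
and the sample URLs are hypothetical, made up for illustration. It drops the
host, keeps the path, and replaces every parameter value with *, so URLs that
differ only in domain or parameter values collapse to one signature:

```python
from collections import Counter
from urllib.parse import urlsplit, parse_qsl


def url_signature(url):
    """Return the path plus sorted parameter names, with every value masked."""
    parts = urlsplit(url)
    # Keep only the parameter names; the values are what varies per URL.
    names = sorted(name for name, _ in parse_qsl(parts.query, keep_blank_values=True))
    query = "&".join(f"{name}=*" for name in names)
    return parts.path + ("?" + query if query else "")


def signature_counts(urls):
    """Count how many URLs share each masked signature."""
    return Counter(url_signature(u) for u in urls)


# Hypothetical sample input, not taken from any real feed.
urls = [
    "http://a.example/gate.php?id=1&cmd=ping",
    "http://b.example/gate.php?cmd=load&id=77",
    "http://c.example/index.html",
]

for signature, count in signature_counts(urls).most_common():
    print(count, signature)
```

On a real 200k list, the most common signatures would be the candidates for
prequalifying strings. Whether they are specific enough to avoid false
positives is a separate question this sketch does not answer.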
I've done some of this as an experiment for myself. I'm also trying to
integrate checks against the Google Safe Browsing API, and I use an
"onSameHost" lookup:
http://www.bing.com/search?q=ip%3A90.156.241.4 (000007.ru)
I use the "onSameHost" lookup as extra metadata, just to keep a
record; it hasn't proven to be as effective as the GSB API has. I need to
spend some more time with it, but if anyone needs or wants to know
how to do "onSameHost", use *shudder* Bing to do it, unless Google has
added the ip: qualifier, which it hadn't the last time I checked.
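For anyone who wants to script the same-host lookup described above, here is
a rough sketch, assuming Python and only the standard library. The function
name same_host_query_url is hypothetical. It only builds the Bing search URL
in the same form as the example link; fetching and parsing the results page
is left out, since that depends on Bing's page layout:

```python
import ipaddress
from urllib.parse import quote


def same_host_query_url(ip):
    """Build a Bing search URL that uses the ip: qualifier for one address."""
    # Raises ValueError if the input is not a valid IP address.
    ipaddress.ip_address(ip)
    # Percent-encode the whole query term so the colon becomes %3A.
    return "http://www.bing.com/search?q=" + quote("ip:" + ip, safe="")


print(same_host_query_url("90.156.241.4"))
```

For 90.156.241.4 this produces the same URL as the link above.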
(I can't believe "reply" doesn't reply to all on this list; I forget
every time...)
-rich