Project author: linkedin

Project description:
Libraries and tools for advanced feature engineering
Language: Java
Project URL: git://github.com/linkedin/FeatureFu.git
Created: 2015-04-21T23:26:41Z
Project community: https://github.com/linkedin/FeatureFu

License: Apache License 2.0


FeatureFu

FeatureFu[l] contains a collection of libraries and tools for advanced feature engineering. For example, it uses extended s-expression based feature transformations to derive features on top of other features, or to convert a lightweight model (logistic regression or decision tree) into a feature, in an intuitive way without touching any code.

Sample use cases:

  1. Feature normalization

    "(min 1 (max (+ (* slope x) intercept) 0))" : scale feature x with slope and intercept, and normalize to [0,1]

  2. Feature combination

    "(- (log2 (+ 5 impressions)) (log2 (+ 1 clicks)))" : combine #impressions and #clicks into a smoothed CTR style feature

  3. Nonlinear featurization

    "(if (> query_doc_matches 0) 0 1)" : negation of a query/document matching feature

  4. Cascading modeling

    "(sigmoid (+ (+ (..) w1) w0))" : convert a logistic regression model into a feature

  5. Model combination (e.g. combine decision tree and linear regression)

    "(+ (* model1_score w1) (* model2_score w2))" : combine two model scores into one final score, weighted by w1 and w2
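
To make the arithmetic behind use cases 1 to 3 concrete, here is a plain-Java sketch of what those expressions compute. It does not use FeatureFu at all: the class and method names (UseCaseSketch, normalize, smoothedCtr, negateMatch) are hypothetical, and it assumes that log2 means the base-2 logarithm.

```java
// Hypothetical illustration only: plain-Java equivalents of three sample
// s-expressions. None of these names come from FeatureFu.
final class UseCaseSketch {
    // (min 1 (max (+ (* slope x) intercept) 0))
    static double normalize(double x, double slope, double intercept) {
        return Math.min(1.0, Math.max(slope * x + intercept, 0.0));
    }

    // (- (log2 (+ 5 impressions)) (log2 (+ 1 clicks)))
    // Assumption: log2 is the base-2 logarithm.
    static double smoothedCtr(double impressions, double clicks) {
        return log2(5.0 + impressions) - log2(1.0 + clicks);
    }

    // (if (> query_doc_matches 0) 0 1)
    static double negateMatch(double queryDocMatches) {
        return queryDocMatches > 0 ? 0.0 : 1.0;
    }

    // Base-2 logarithm via the change-of-base identity.
    static double log2(double v) {
        return Math.log(v) / Math.log(2.0);
    }

    public static void main(String[] args) {
        System.out.println(normalize(3.0, 0.1, 0.2));
        System.out.println(smoothedCtr(123.0, 7.0));
        System.out.println(negateMatch(2.0));
    }
}
```

Writing the same logic as an s-expression keeps it in configuration rather than in compiled code, which is the point of the "without touching any code" claim above.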

Expr: A super fast and simple evaluator for mathematical s-expressions written in Java.

Using it is as simple as:

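    import java.util.HashMap;
    import java.util.Map;

    // Registry that holds the variables referenced by parsed expressions
    VariableRegistry variableRegistry = new VariableRegistry();
    // Parse the s-expression against the registry
    Expr expression = Expression.parse("(sigmoid (+ (* a x) b))", variableRegistry);
    // Look up handles to individual variables by name
    Variable x = variableRegistry.findVariable("x");
    Variable a = variableRegistry.findVariable("a");
    Variable b = variableRegistry.findVariable("b");
    // Evaluate with the variables' current values
    expression.evaluate();
    // Supply values for all variables at once from a name-to-value map
    Map<String, Double> varMap = new HashMap<String, Double>();
    varMap.put("x", 0.2);
    varMap.put("a", 0.6);
    varMap.put("b", 0.8);
    variableRegistry.refresh(varMap);
    // Re-evaluate with the refreshed values
    expression.evaluate();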
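
As a cross-check for the values in the map above, the following plain-Java sketch computes the same quantity directly. It assumes that sigmoid denotes the standard logistic function 1 / (1 + e^(-z)), which this README does not state explicitly. SigmoidSketch and its methods are hypothetical names and are not part of the Expr API.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only; this is not the Expr API.
final class SigmoidSketch {
    // Standard logistic function (the assumed meaning of "sigmoid").
    static double sigmoid(double z) {
        return 1.0 / (1.0 + Math.exp(-z));
    }

    // Plain-Java equivalent of "(sigmoid (+ (* a x) b))" over a name-to-value map.
    static double evaluate(Map<String, Double> vars) {
        return sigmoid(vars.get("a") * vars.get("x") + vars.get("b"));
    }

    public static void main(String[] args) {
        Map<String, Double> varMap = new HashMap<>();
        varMap.put("x", 0.2);
        varMap.put("a", 0.6);
        varMap.put("b", 0.8);
        // Computes sigmoid(0.6 * 0.2 + 0.8)
        System.out.println(evaluate(varMap));
    }
}
```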

To Build

gradle clean build

Test

    $ cd build/expr/lib
    $ java -cp expr-1.0.jar Expression "(+ 0.5 (* (/ 15 1000) (ln (- 55 12))))"
    =(0.5+((15.0/1000.0)*ln((55.0-12.0))))
    =0.5564180017354035
    tree
    └── +
        ├── 0.5
        └── *
            ├── /
            |   ├── 15.0
            |   └── 1000.0
            └── ln
                └── -
                    ├── 55.0
                    └── 12.0
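
The tree above suggests how such an expression can be evaluated: each operator node applies itself to the values of its children. The following is a minimal sketch of that idea for numeric constants and a handful of operators. It is not FeatureFu's implementation; MiniSexpr and all of its methods are hypothetical, and it does no error recovery for malformed input. Its result for the expression above should agree with the printed value up to floating-point rounding.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of a recursive s-expression evaluator.
// Not FeatureFu's implementation; supports only +, -, *, / and ln.
final class MiniSexpr {
    private final List<String> tokens = new ArrayList<>();
    private int pos = 0;

    private MiniSexpr(String source) {
        // Pad parentheses with spaces so a whitespace split yields clean tokens.
        String padded = source.replace("(", " ( ").replace(")", " ) ").trim();
        for (String t : padded.split("\\s+")) {
            tokens.add(t);
        }
    }

    // Parse and evaluate the whole expression in one recursive pass.
    static double eval(String source) {
        MiniSexpr p = new MiniSexpr(source);
        double value = p.next();
        if (p.pos != p.tokens.size()) {
            throw new IllegalArgumentException("unexpected trailing tokens");
        }
        return value;
    }

    // Evaluate the sub-expression that starts at the current token.
    private double next() {
        String t = tokens.get(pos++);
        if (!t.equals("(")) {
            return Double.parseDouble(t); // atom: numeric constant
        }
        String op = tokens.get(pos++);
        List<Double> args = new ArrayList<>();
        while (!tokens.get(pos).equals(")")) {
            args.add(next()); // children are evaluated before their parent
        }
        pos++; // consume ")"
        return apply(op, args);
    }

    private static double apply(String op, List<Double> a) {
        switch (op) {
            case "+": return a.get(0) + a.get(1);
            case "-": return a.get(0) - a.get(1);
            case "*": return a.get(0) * a.get(1);
            case "/": return a.get(0) / a.get(1);
            case "ln": return Math.log(a.get(0));
            default: throw new IllegalArgumentException("unknown operator: " + op);
        }
    }

    public static void main(String[] args) {
        System.out.println(eval("(+ 0.5 (* (/ 15 1000) (ln (- 55 12))))"));
    }
}
```

This sketch re-parses the string on every call. The usage shown earlier parses once and then refreshes variable values before each evaluation, which avoids repeated parsing when the same expression is applied to many inputs.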

Maven

expr is available under:

    <dependency>
      <groupId>com.linkedin.featurefu</groupId>
      <artifactId>expr</artifactId>
      <version>0.0.3</version>
    </dependency>

Gradle

    dependencies {
      compile "com.linkedin.featurefu:expr:0.0.3"
    }