Merge pull request #1 from IBM-HRL-MLHLS/update-readme

ehudkr · web-flow · commit c4e56666a3ab · 2018-01-31T14:27:39.000+02:00
Update readme
diff --git a/README.md b/README.md
@@ -11,19 +11,34 @@ Framework for evaluating causal inference methods.
  - [Authors](#authors)
 
 ## General
-Causality-Benchmark is a library developed by IBM Research for benchmarking algorithms that 
-estimate causal effect.
-The framework includes unlabeled data, labeled data, and code for scoring algorithm predictions.  
-It can benchmark predictions of both population effect size and individual effect size.  
-
-The evaluation script is not bounded to the provided data, and can be used on other data as 
-long as some basic requirements are kept regarding the formats.  
-For more technical details about the evaluation metrics and the data, please refer to the 
-framework menuscript **TODO: Link to the menuscript/technical report**
-
-Please note that due to GitHub limitation, only a sample of the data is available in this 
-repository. However, you can manually access and download the entire dataset from the 
-[Synapse sharing platform](https://www.synapse.org/#!Synapse:syn11294478/files/)
+Causality-Benchmark is a library developed by IBM Research Haifa for 
+benchmarking algorithms that estimate the causal effect of a treatment on 
+some outcome. The framework includes unlabeled data, labeled data, and code 
+for scoring algorithm predictions. It can benchmark predictions of both 
+population effect size and individual effect size.  
+
+The feature matrix is derived from the 
+[linked birth and infant death data](https://www.cdc.gov/nchs/nvss/linked-birth.htm),
+and the labeled and unlabeled data are based on simulated models of the 
+treatment assignment, treatment effect, and censoring.
+
+The evaluation script is not bound to the provided data, 
+and can be used on other data as 
+long as that data meets some basic format requirements. 
+Full technical details regarding the calculated metrics and the formats of the 
+labeled and unlabeled data will be published soon through a related manuscript. 
+Meanwhile, most of the details can be found in the 
+[2018 Causal Inference Challenge](https://www.synapse.org/ACIC2018Challenge)
+website.
+
+Please note that due to GitHub limitations, only a sample of the data is 
+available in this repository. However, you can manually access and download 
+the entire dataset from the 
+[Synapse sharing platform](https://www.synapse.org/#!Synapse:syn11294478/files/).
+Furthermore, since the benchmarking tool is used in the 
+[Causal Inference Challenge 2018](https://www.synapse.org/#!Synapse:syn11294478),
+the dataset currently includes a handful of example data with labels.
+The full set of labeled data will be available when the challenge ends.
 
 ## Getting Started
 ### Prerequisites
@@ -82,7 +97,7 @@ scores = evaluate(PATH_TO_PREDICTION_OUTPUT, PATH_TO_COUNTERFACTUAL_FILES_DIRECT
                   individual_prediction=True)
 ```
 ##### Expected Files
-* The counterfactual files (holding $y^1$, $y^0$ for each individual), are expected to be a
+* The counterfactual outcomes files (holding $y^1$, $y^0$ for each individual), are expected to be a
   directory with different comma-separated-files and their file names corresponding to the
   data-instance but having some suffix (e.g. `"_cf.csv"`).
 * The predictions for population effect size are expected to be one comma-delimited-file with
@@ -91,7 +106,7 @@ scores = evaluate(PATH_TO_PREDICTION_OUTPUT, PATH_TO_COUNTERFACTUAL_FILES_DIRECT
   comma-delimited-files, each corresponding to a data-instance and each containing the
   estimated outcome under no-treatment and under positive treatment.
 
-For full explanation, please refer to the menuscript **TODO: link to menuscript** 
+For a full explanation, please refer to the manuscript.
 
 #### Estimation
 To avoid inflating file sizes for nothing, 
@@ -128,10 +143,7 @@ The current content is open source under Apache License 2.0. For full specificat
 [License.txt](License.txt)
 
 ## Authors
-* bullets (link to personal github profile)
-* of 
-* authors' (link to personal site)
-* names
-
-
+* Yishai Shimoni ([Homepage](http://researcher.watson.ibm.com/researcher/view.php?person=il-YISHAIS))
+* Chen Yanover ([Homepage](http://researcher.watson.ibm.com/researcher/view.php?person=il-CHENY))
+* Ehud Karavani ([Github](https://github.com/ehudkr))
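
Note on the "Expected Files" hunk: the README says each counterfactual outcomes file holds $y^1$ and $y^0$ for every individual. The following is a minimal sketch of how true effects could be derived from such a file. It is not the library's own code. The column names `sample_id`, `y0`, and `y1`, and the helper `true_effects`, are hypothetical and chosen only for illustration. Taking the population effect as the mean of the individual effects is likewise an assumption here.

```python
import csv
import io


def true_effects(cf_csv_text):
    """Return (individual_effects, population_effect) from counterfactual CSV text.

    Assumes hypothetical columns 'sample_id', 'y0', 'y1'; the real files
    may use different names.
    """
    reader = csv.DictReader(io.StringIO(cf_csv_text))
    # Individual effect: outcome under treatment minus outcome under no treatment.
    individual = {
        row["sample_id"]: float(row["y1"]) - float(row["y0"]) for row in reader
    }
    # Population effect, taken here as the mean of the individual effects.
    population = sum(individual.values()) / len(individual)
    return individual, population


# Tiny made-up example standing in for one "<data-instance>_cf.csv" file.
example = "sample_id,y0,y1\na,1.0,3.0\nb,2.0,2.5\nc,0.0,1.5\n"
individual, population = true_effects(example)
```

In practice the same function would be applied to the text of each file in the counterfactual directory, matching files to data instances by name.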