UniBind Changelog

January 31, 2019

Bulk download modification: The TFBS BED and FASTA files available for bulk download now include the PFM ID in the file name. For transcription factors that have PFM variants (e.g., TFAP2C, JUND), the files are no longer concatenated with duplicates removed; instead, each variant has its own file. Example file name: dataset_cellline_tf_name_pfm_id.bed

April 04, 2019

1. The OCT4 entries were renamed to POU5F1 to follow the change in gene naming. The affected entries are GSE17917.OCT4.ESC, GSE200650.OCT4.ESC, and GSE21614.OCT4.BGO3.

2. All the file names and folders have been renamed to follow a uniform format:
[dataset_id].[cell_type]_[condition(s)].[TF_name].[jaspar_id].[jaspar_version].[computational_model].[extension]
For example: GSE83860.LNCAP_dht_tnfa.FOXA1.MA0148.3.pwm.bed
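The uniform format in item 2 can be split back into its fields. The following is a minimal sketch, not part of UniBind itself: the names UniBindFileName and parse_unibind_filename are hypothetical, and it assumes that no individual field contains a dot, so the name always splits into exactly seven parts.

```python
from typing import NamedTuple


class UniBindFileName(NamedTuple):
    """Fields of a file name in the uniform format (hypothetical helper type)."""
    dataset_id: str
    cell_type_and_conditions: str
    tf_name: str
    jaspar_id: str
    jaspar_version: str
    computational_model: str
    extension: str


def parse_unibind_filename(name: str) -> UniBindFileName:
    """Split a file name on dots into its seven fields.

    Assumption: no field contains a dot, so a valid name has exactly 7 parts.
    """
    parts = name.split(".")
    if len(parts) != 7:
        raise ValueError(f"expected 7 dot-separated fields, got {len(parts)}: {name!r}")
    return UniBindFileName(*parts)


parsed = parse_unibind_filename("GSE83860.LNCAP_dht_tnfa.FOXA1.MA0148.3.pwm.bed")
print(parsed.tf_name, parsed.jaspar_id, parsed.jaspar_version)
```

The cell type and condition(s) are kept together in one field because the format joins them with underscores, and the changelog does not say whether a cell type name can itself contain an underscore.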

April 28, 2019

Duplicated entries were removed from the BED and FASTA files. Some of the datasets contained a small number of duplicates because MACS2 peaks were detected in close proximity to each other. Although these were treated as individual peaks, in some cases the top-scoring TFBS sequence was the same, resulting in duplicated entries.
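For users who hold copies of the older files, a comparable clean-up can be sketched as follows. This is an illustration only and not the procedure used by UniBind: the function name drop_duplicate_lines is hypothetical, and it assumes that a duplicated entry appears as an exactly identical line in the BED file.

```python
from typing import Iterable, List


def drop_duplicate_lines(lines: Iterable[str]) -> List[str]:
    """Return the lines with exact repeats removed, keeping first occurrences in order.

    Assumption: duplicated TFBS entries are byte-identical lines
    (ignoring the trailing newline).
    """
    seen = set()
    unique = []
    for line in lines:
        key = line.rstrip("\n")
        if key in seen:
            continue  # same entry already kept from a neighbouring peak
        seen.add(key)
        unique.append(line)
    return unique


example = [
    "chr1\t100\t110\tsiteA\n",
    "chr1\t100\t110\tsiteA\n",  # duplicate from a nearby peak
    "chr2\t500\t510\tsiteB\n",
]
print(len(drop_duplicate_lines(example)))
```

Keeping the first occurrence preserves the original order of the file, which matters if the entries are sorted by position.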

July 17, 2019

1. The computational model score for each TFBS is now included in the BED files as the 5th column.

2. Bug fix: the BEM model used non-DiMO-optimized PWMs. This is now fixed, and the corresponding files have been updated.

3. For some data sets, the enrichment zone heatmap plots failed due to errors in the bandwidth calculation, and consequently the other plots for that data set failed as well. Where possible, this error was fixed; otherwise, the text "Not available" is shown in place of the plot(s).
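As an illustration of item 1 in this entry, the score can be read from the 5th column of a BED file. This is a minimal sketch under stated assumptions: the function name read_bed_scores is hypothetical, the files are assumed to be tab-delimited with the usual BED layout in the first four columns (chromosome, start, end, name), and the score is assumed to parse as a number.

```python
from typing import Iterable, List, Tuple

BedRecord = Tuple[str, int, int, str, float]


def read_bed_scores(lines: Iterable[str]) -> List[BedRecord]:
    """Parse BED lines and return (chrom, start, end, name, score) tuples.

    Assumptions: tab-delimited fields; columns 1-4 follow the usual BED
    layout; column 5 holds the model score and parses as a float.
    """
    records = []
    for line in lines:
        line = line.strip()
        # Skip blank lines and common BED header/comment lines.
        if not line or line.startswith(("#", "track", "browser")):
            continue
        fields = line.split("\t")
        if len(fields) < 5:
            raise ValueError(f"expected at least 5 columns, got {len(fields)}: {line!r}")
        chrom, start, end, name, score = fields[:5]
        records.append((chrom, int(start), int(end), name, float(score)))
    return records


# Made-up example line, not taken from a UniBind file.
print(read_bed_scores(["chr1\t100\t110\tsite1\t12.5\t+\n"]))
```

To read a file from disk, pass an open file handle, for example `read_bed_scores(open(path))`, since the function accepts any iterable of lines.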