First Steps

Before running diffusion algorithms on your network with DiffuPy, learn which graph and data formats are supported. Below you can find sample input datasets and networks to run diffusion methods over.

Input Data

You can submit your dataset in any of the following formats:

  • CSV (.csv)

  • TSV (.tsv)

Please ensure that the dataset minimally has a column ‘Node’ containing node IDs. You can also optionally add the following columns to your dataset:

  • NodeType

  • LogFC *

  • p-value

* Log2 fold change
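The column requirements above can be checked before submitting a file. The following is a minimal sketch using only the Python standard library; the function name `inspect_dataset` and the constants are hypothetical helpers for illustration and are not part of the DiffuPy API.

```python
import csv
import io

# Column names taken from the lists above.
REQUIRED_COLUMN = "Node"
OPTIONAL_COLUMNS = ("NodeType", "LogFC", "p-value")


def inspect_dataset(text, delimiter=","):
    """Parse CSV/TSV text and report which optional columns are present.

    Hypothetical helper: raises ValueError if the 'Node' column is missing.
    """
    reader = csv.DictReader(io.StringIO(text), delimiter=delimiter)
    header = reader.fieldnames or []
    if REQUIRED_COLUMN not in header:
        raise ValueError(f"dataset is missing the required '{REQUIRED_COLUMN}' column")
    present = [name for name in OPTIONAL_COLUMNS if name in header]
    return list(reader), present


# A small TSV dataset with the required column and one optional column.
sample_tsv = "Node\tLogFC\nA\t4\nB\t-1\n"
rows, present = inspect_dataset(sample_tsv, delimiter="\t")
print(len(rows), present)
```

Pass `delimiter=","` for .csv files and `delimiter="\t"` for .tsv files.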

Input dataset examples

DiffuPath accepts several input formats which can be codified in different ways. See the diffusion scores summary for more details.

1. You can provide a dataset with a column ‘Node’ containing node IDs.

    Node
    A
    B
    C
    D

2. You can also provide a dataset with a column ‘Node’ containing node IDs as well as a column ‘NodeType’ indicating the entity type of each node, so that diffusion can be run by entity type.

    Node    NodeType
    A       Gene
    B       Gene
    C       Metabolite
    D       Gene
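To illustrate what the ‘NodeType’ column makes possible, the sketch below groups the example nodes by entity type using the standard library. It is only an illustration of the data layout, under the assumption that the dataset is stored as CSV; `group_nodes_by_type` is a hypothetical helper, and this is not necessarily how DiffuPy partitions nodes internally.

```python
import csv
import io
from collections import defaultdict

# The example dataset above, written as CSV text.
dataset_csv = """Node,NodeType
A,Gene
B,Gene
C,Metabolite
D,Gene
"""


def group_nodes_by_type(text):
    """Map each NodeType value to the list of node IDs carrying it."""
    groups = defaultdict(list)
    for row in csv.DictReader(io.StringIO(text)):
        groups[row["NodeType"]].append(row["Node"])
    return dict(groups)


nodes_by_type = group_nodes_by_type(dataset_csv)
print(nodes_by_type)  # {'Gene': ['A', 'B', 'D'], 'Metabolite': ['C']}
```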

3. You can also provide a dataset with a column ‘Node’ containing node IDs as well as a column ‘LogFC’ with their log2 fold changes. You may also add a ‘NodeType’ column to run diffusion by entity type.

    Node    LogFC
    A       4
    B       -1
    C       1.5
    D       3

4. Finally, you can provide a dataset with a column ‘Node’ containing node IDs, a column ‘LogFC’ with their log2 fold changes, and a column ‘p-value’ with adjusted p-values. You may also add a ‘NodeType’ column to run diffusion by entity type.

    Node    LogFC    p-value
    A       4        0.03
    B       -1       0.05
    C       1.5      0.001
    D       3        0.07
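As a sketch of how the last example could be read with numeric values, the snippet below parses the same rows from CSV text and converts the ‘LogFC’ and ‘p-value’ columns to floats. `load_scores` is a hypothetical name used only for illustration; DiffuPy's own loading routines may behave differently.

```python
import csv
import io

# The fourth example dataset, written as CSV text.
dataset_csv = """Node,LogFC,p-value
A,4,0.03
B,-1,0.05
C,1.5,0.001
D,3,0.07
"""


def load_scores(text):
    """Return {node ID: (log2 fold change, adjusted p-value)} as floats."""
    scores = {}
    for row in csv.DictReader(io.StringIO(text)):
        scores[row["Node"]] = (float(row["LogFC"]), float(row["p-value"]))
    return scores


scores = load_scores(dataset_csv)
print(scores["C"])  # (1.5, 0.001)
```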

See the sample datasets directory for example files.

Networks

If you would like to submit your own networks, please ensure they are in one of the following formats:

  • BEL (.bel)

  • CSV (.csv)

  • Edge list (.lst)

  • GML (.gml or .xml)

  • GraphML (.graphml or .xml)

  • Pickle (.pickle): a BELGraph object from PyBEL 0.13.2

  • TSV (.tsv)

  • TXT (.txt)
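A quick way to check a file name against the extensions listed above is sketched below. The set and the function `has_supported_extension` are hypothetical and only mirror this list; they are not taken from the DiffuPy code base. Note that .xml appears under both GML and GraphML, so the extension alone does not identify the format in that case.

```python
from pathlib import Path

# Extensions copied from the list of supported network formats above.
SUPPORTED_NETWORK_EXTENSIONS = {
    ".bel", ".csv", ".lst", ".gml", ".graphml", ".xml", ".pickle", ".tsv", ".txt",
}


def has_supported_extension(path):
    """Return True if the file name ends in one of the listed extensions."""
    return Path(path).suffix.lower() in SUPPORTED_NETWORK_EXTENSIONS


print(has_supported_extension("network.graphml"))  # True
print(has_supported_extension("network.json"))     # False
```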

At a minimum, please ensure that both of the following columns are included in the network file you submit:

  • Source

  • Target

Optionally, you can add a third column, “Relation”, to your network (as in the example below). If the relation between the Source and Target nodes is omitted, or if the directionality is ambiguous, either node can be assigned as the Source or the Target.

Custom-network example

    Source    Target    Relation
    A         B         Increase
    B         C         Association
    A         D         Association
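The custom-network example can be read into a list of edges as sketched below, assuming it is stored as CSV. `read_edges` is a hypothetical helper written with the standard library only; it checks for the required ‘Source’ and ‘Target’ columns and treats ‘Relation’ as optional.

```python
import csv
import io

# The custom-network example, written as CSV text.
network_csv = """Source,Target,Relation
A,B,Increase
B,C,Association
A,D,Association
"""


def read_edges(text, delimiter=","):
    """Return a list of (source, target, relation) tuples.

    The relation is None when the optional 'Relation' column is absent.
    """
    reader = csv.DictReader(io.StringIO(text), delimiter=delimiter)
    header = reader.fieldnames or []
    missing = [name for name in ("Source", "Target") if name not in header]
    if missing:
        raise ValueError(f"network is missing required column(s): {missing}")
    return [(row["Source"], row["Target"], row.get("Relation")) for row in reader]


edges = read_edges(network_csv)
print(edges[0])  # ('A', 'B', 'Increase')
```

A network without the third column is read the same way, with `None` in place of each relation.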

You can also take a look at our sample networks folder for some examples.