export_to_phy vs using kilosort sorter output params.py directly

Hello! 
We realised in our lab recently that two of us were using two different ways to manually curate units with Phy after sorting with kilosort4. 
**The first is to use the `export_to_phy` function and then open the resulting params.py file with Phy, as described in the documentation (the export is very slow, sometimes slower than the sorting itself):**

```
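# Assumed imports and aliases (not shown in the original snippet)
from pathlib import Path
import spikeinterface as si
import spikeinterface.exporters as sexp

analyzer = si.create_sorting_analyzer(sorting_KS4, recording_saved, sparse=True)
# compute all the extensions required
sexp.export_to_phy(sorting_analyzer=analyzer, output_folder=Path(data_dir) / 'phy_folder', verbose=True, copy_binary=False)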
```
**The second is to compute only the metrics needed and save them as tsv files** in the sorter output folder, where the **params.py from the sorter output** is generated, and then **open that params.py with Phy without calling `export_to_phy`** (much faster, barely any extra compute time).
```
# Assumed imports and aliases (not shown in the original snippet)
import csv
from pathlib import Path
import spikeinterface as si
import spikeinterface.qualitymetrics as sqm

sorting_analyzer = si.create_sorting_analyzer(sorting=sorting_KS4, recording=rec_corrected,
                                              format="binary_folder", folder=KSfolder / 'analyzer_med')
contamination = sqm.compute_sliding_rp_violations(sorting_analyzer=sorting_analyzer,
                                                  bin_size_ms=0.25)
presence_ratio = sqm.compute_presence_ratios(sorting_analyzer=sorting_analyzer)

def save_dict_to_tsv(data, header_name, file_path, delimiter='\t'):
    """
    Saves a dictionary to a TSV file inside KSfolder / 'sorter_output'.

    Args:
        data (dict): The dictionary to save, mapping cluster id to metric value.
        header_name (str): Column name for the metric values.
        file_path (str): File name of the TSV file, relative to the sorter output folder.
        delimiter (str, optional): The delimiter. Defaults to tab.
    """
    with open(Path(KSfolder) / 'sorter_output' / file_path, 'w', newline='', encoding='utf-8') as tsvfile:
        writer = csv.writer(tsvfile, delimiter=delimiter)
        # one header row, then one row per cluster
        writer.writerow(['cluster_id', header_name])
        for key, value in data.items():
            writer.writerow([key, value])

save_dict_to_tsv(contamination, 'sliding_rp', 'cluster_sliding_rp.tsv')
save_dict_to_tsv(presence_ratio, 'presence', 'cluster_presence.tsv')
# these files must end up in the same folder that holds your params.py file for phy
# (here assumed to be KSfolder / 'sorter_output')
```
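For anyone wanting to try the second approach without our `KSfolder` global, here is a minimal, self-contained sketch of the same TSV writer that takes the output folder explicitly. The function name `write_cluster_metric_tsv` and the `cluster_<name>.tsv` naming are our own illustrative choices, and the metric values in the usage note are made up. We are assuming that Phy picks up `cluster_*.tsv` files with a `cluster_id` column placed next to params.py, which is what we observed in practice.

```python
import csv
from pathlib import Path


def write_cluster_metric_tsv(metric, column_name, out_dir):
    """Write a {unit_id: value} dict as cluster_<column_name>.tsv in out_dir.

    Hypothetical helper: out_dir is assumed to be the folder that holds the
    params.py file opened with Phy.
    """
    out_path = Path(out_dir) / f"cluster_{column_name}.tsv"
    with open(out_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f, delimiter="\t")
        # Header row: the id column, then the metric column
        writer.writerow(["cluster_id", column_name])
        for unit_id, value in metric.items():
            writer.writerow([unit_id, value])
    return out_path
```

Usage would look like `write_cluster_metric_tsv({0: 0.01, 1: 0.5}, "sliding_rp", KSfolder / "sorter_output")`, with the dict replaced by the output of the metric functions above.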

We tried both for the same sorting, and the only thing that jumped out at us was that the second method resulted in fewer channels in the waveform view in Phy, but no other noticeable difference.
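One cheap first check when comparing the two folders is to diff the two params.py files, since that is the file Phy is pointed at in both cases. The sketch below is only an assumption about how one might do that: `read_phy_params` and `diff_params` are hypothetical helper names, and it assumes params.py consists of simple `name = literal` assignments (e.g. `n_channels_dat`, `dtype`, `hp_filtered`). It will not by itself explain the difference in displayed channels, but it shows whether the two folders describe the recording differently.

```python
import ast
from pathlib import Path


def read_phy_params(params_path):
    """Parse a params.py file into a dict without executing it."""
    params = {}
    tree = ast.parse(Path(params_path).read_text(encoding="utf-8"))
    for node in tree.body:
        # Only handle plain `name = value` assignments
        if (isinstance(node, ast.Assign) and len(node.targets) == 1
                and isinstance(node.targets[0], ast.Name)):
            name = node.targets[0].id
            try:
                params[name] = ast.literal_eval(node.value)
            except ValueError:
                # Not a plain literal: keep the source text instead
                params[name] = ast.unparse(node.value)
    return params


def diff_params(params_a, params_b):
    """Return {key: (value_a, value_b)} for every key whose values differ."""
    keys = sorted(set(params_a) | set(params_b))
    return {k: (params_a.get(k), params_b.get(k))
            for k in keys if params_a.get(k) != params_b.get(k)}
```

For example, `diff_params(read_phy_params(phy_folder / "params.py"), read_phy_params(KSfolder / "sorter_output" / "params.py"))` would list only the entries that differ between the two folders (the paths here follow the snippets above).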

What does `export_to_phy` do that takes so much time, and is it necessary, given that the second method seems to work fine? Or are we missing something here?

export_to_phy vs using kilosort sorter output params.py directly #4635
