{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "f4ca731c",
   "metadata": {},
   "source": [
    "# S3. Score Consistency\n",
    "\n",
    "Here, we briefly demonstrate the similarities and discrepancies between LIANA's output in python and R. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "id": "c28005d4",
   "metadata": {
    "vscode": {
     "languageId": "r"
    }
   },
   "outputs": [],
   "source": [
    "library(liana, quietly = TRUE)\n",
    "library(plyr, quietly = TRUE)\n",
    "library(parallel, quietly = TRUE)\n",
    "\n",
    "data.path <- file.path('..', '..', 'data')\n",
    "output_folder <- file.path(data.path, 'liana-outputs/')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "id": "4000f164",
   "metadata": {},
   "outputs": [],
   "source": [
    "# parallelization\n",
    "n.cores<-parallel::detectCores() # default to all\n",
    "if (n.cores<=1){\n",
    "    par = F\n",
    "}else{\n",
    "    par = T\n",
    "}"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "c6baad9f",
   "metadata": {},
   "source": [
    "#### Comparison with Python output:\n",
    "\n",
    "There are minor differences with the LIANA implementation in python that lead to outputs not being identical\n",
    "\n",
    "- SingleCellSignalR Magnitude (LRscore): precision - slightly different after 3rd decimal place\n",
    "- LogFC Specificity (logfc_comb): similar relative differences but different exact values\n",
    "- CellPhoneDB Specificity (pvalue): similar relative differences but different exact values\n",
    "- CellChat: not run by default in R\n",
    "\n",
    "Let's check the consistency in the magnitude aggregate rank score when running the different methods that report magnitude (excluding CellChat, which is not present by default in R). "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "id": "838965c5",
   "metadata": {
    "vscode": {
     "languageId": "r"
    }
   },
   "outputs": [],
   "source": [
    "covid_data <- readRDS(file.path(data.path, 'covid_balf_norm.rds'))\n",
    "md <- covid_data@colData\n",
    "barcodes <- rownames(md[md$sample == 'C100', ])\n",
    "\n",
    "sdata <- covid_data[, barcodes]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "id": "cc8e2afd",
   "metadata": {
    "vscode": {
     "languageId": "r"
    }
   },
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Running LIANA with `cell.type` as labels!\n",
      "\n",
      "`Idents` were converted to factor\n",
      "\n",
      "Cell identities with less than 5 cells: Plasma were removed!\n",
      "\n",
      "Warning message in exec(output, ...):\n",
      "“5591 genes and/or 0 cells were removed as they had no counts!”\n",
      "Warning message:\n",
      "“\u001b[1m\u001b[22m`invoke()` is deprecated as of rlang 0.4.0.\n",
      "Please use `exec()` or `inject()` instead.\n",
      "\u001b[90mThis warning is displayed once every 8 hours.\u001b[39m”\n",
      "LIANA: LR summary stats calculated!\n",
      "\n",
      "Now Running: Cellphonedb\n",
      "\n",
      "Warning message:\n",
      "“\u001b[1m\u001b[22m`progress_estimated()` was deprecated in dplyr 1.0.0.\n",
      "\u001b[36mℹ\u001b[39m The deprecated feature was likely used in the \u001b[34mliana\u001b[39m package.\n",
      "  Please report the issue at \u001b[3m\u001b[34m<https://github.com/saezlab/liana/issues>\u001b[39m\u001b[23m.”\n",
      "Now Running: Natmi\n",
      "\n",
      "Now Running: Sca\n",
      "\n"
     ]
    }
   ],
   "source": [
    "liana_res_partial <- liana_wrap(sce = sdata, \n",
    "                                idents_col='cell.type', \n",
    "                                assay.type = 'logcounts', \n",
    "                                verbose=T, \n",
    "                                method = c(\"cellphonedb\", \"natmi\", 'sca'), \n",
    "                                parallelize = par, workers = n.cores)\n",
    "liana_aggregate_partial <- liana_aggregate(liana_res = liana_res_partial, \n",
    "                                           aggregate_how='magnitude', verbose = F)\n",
    "liana_aggregate_partial<-liana_aggregate_partial[c('source', 'target', 'ligand.complex', 'receptor.complex', 'aggregate_rank')]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "id": "92787f77",
   "metadata": {
    "vscode": {
     "languageId": "r"
    }
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<table class=\"dataframe\">\n",
       "<caption>A tibble: 6 × 5</caption>\n",
       "<thead>\n",
       "\t<tr><th scope=col>source</th><th scope=col>target</th><th scope=col>ligand.complex</th><th scope=col>receptor.complex</th><th scope=col>aggregate_rank</th></tr>\n",
       "\t<tr><th scope=col>&lt;chr&gt;</th><th scope=col>&lt;chr&gt;</th><th scope=col>&lt;chr&gt;</th><th scope=col>&lt;chr&gt;</th><th scope=col>&lt;dbl&gt;</th></tr>\n",
       "</thead>\n",
       "<tbody>\n",
       "\t<tr><td>B</td><td>B</td><td>ACTR2 </td><td>ADRB2 </td><td>0.44463670</td></tr>\n",
       "\t<tr><td>B</td><td>B</td><td>ADAM17</td><td>ITGB1 </td><td>0.08916285</td></tr>\n",
       "\t<tr><td>B</td><td>B</td><td>ADAM17</td><td>RHBDF2</td><td>0.36320725</td></tr>\n",
       "\t<tr><td>B</td><td>B</td><td>ADAM28</td><td>ITGA4 </td><td>0.43629682</td></tr>\n",
       "\t<tr><td>B</td><td>B</td><td>APOC2 </td><td>LRP1  </td><td>1.00000000</td></tr>\n",
       "\t<tr><td>B</td><td>B</td><td>APOE  </td><td>LRP1  </td><td>0.72715808</td></tr>\n",
       "</tbody>\n",
       "</table>\n"
      ],
      "text/latex": [
       "A tibble: 6 × 5\n",
       "\\begin{tabular}{lllll}\n",
       " source & target & ligand.complex & receptor.complex & aggregate\\_rank\\\\\n",
       " <chr> & <chr> & <chr> & <chr> & <dbl>\\\\\n",
       "\\hline\n",
       "\t B & B & ACTR2  & ADRB2  & 0.44463670\\\\\n",
       "\t B & B & ADAM17 & ITGB1  & 0.08916285\\\\\n",
       "\t B & B & ADAM17 & RHBDF2 & 0.36320725\\\\\n",
       "\t B & B & ADAM28 & ITGA4  & 0.43629682\\\\\n",
       "\t B & B & APOC2  & LRP1   & 1.00000000\\\\\n",
       "\t B & B & APOE   & LRP1   & 0.72715808\\\\\n",
       "\\end{tabular}\n"
      ],
      "text/markdown": [
       "\n",
       "A tibble: 6 × 5\n",
       "\n",
       "| source &lt;chr&gt; | target &lt;chr&gt; | ligand.complex &lt;chr&gt; | receptor.complex &lt;chr&gt; | aggregate_rank &lt;dbl&gt; |\n",
       "|---|---|---|---|---|\n",
       "| B | B | ACTR2  | ADRB2  | 0.44463670 |\n",
       "| B | B | ADAM17 | ITGB1  | 0.08916285 |\n",
       "| B | B | ADAM17 | RHBDF2 | 0.36320725 |\n",
       "| B | B | ADAM28 | ITGA4  | 0.43629682 |\n",
       "| B | B | APOC2  | LRP1   | 1.00000000 |\n",
       "| B | B | APOE   | LRP1   | 0.72715808 |\n",
       "\n"
      ],
      "text/plain": [
       "  source target ligand.complex receptor.complex aggregate_rank\n",
       "1 B      B      ACTR2          ADRB2            0.44463670    \n",
       "2 B      B      ADAM17         ITGB1            0.08916285    \n",
       "3 B      B      ADAM17         RHBDF2           0.36320725    \n",
       "4 B      B      ADAM28         ITGA4            0.43629682    \n",
       "5 B      B      APOC2          LRP1             1.00000000    \n",
       "6 B      B      APOE           LRP1             0.72715808    "
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "liana_aggregate_partial <- plyr::arrange(liana_aggregate_partial, source, target, ligand.complex, receptor.complex)\n",
    "write.csv(liana_aggregate_partial, paste0(output_folder, 'magnitude_ranks_R.csv'))\n",
    "head(liana_aggregate_partial)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "f24417dd",
   "metadata": {},
   "source": [
    "Note, to run the correlation, make sure to have run the [companion Python tutorial](../ccc_python/S3_Score_Consistency.ipynb) up to the point where you save the csv named \"magnitude_ranks_python.csv\". "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "id": "3e49008a",
   "metadata": {
    "vscode": {
     "languageId": "r"
    }
   },
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\u001b[1m\u001b[22mJoining with `by = join_by(source, target, ligand.complex, receptor.complex)`\n",
      "Warning message in cor.test.default(la$aggregate_rank, la$magnitude_rank, method = c(\"spearman\")):\n",
      "“Cannot compute exact p-value with ties”\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[1] \"The spearman correlation bewteen R and python aggregate magnitude scores is: 0.98\"\n"
     ]
    }
   ],
   "source": [
    "# read and format python aggregate rank\n",
    "lap_python<-read.csv(paste0(output_folder, 'magnitude_ranks_python.csv'))\n",
    "clp<-colnames(lap_python)\n",
    "clp[4]<-'ligand.complex'\n",
    "clp[5]<-'receptor.complex'\n",
    "colnames(lap_python)<-clp\n",
    "lap_python<-lap_python[colnames(lap_python) != 'X']\n",
    "\n",
    "# merge the two scores\n",
    "la<-dplyr::inner_join(liana_aggregate_partial, lap_python)\n",
    "\n",
    "spearmanr<-cor.test(la$aggregate_rank, \n",
    "         la$magnitude_rank, method = c('spearman'))$estimate[['rho']]\n",
    "print(paste0('The spearman correlation bewteen R and python aggregate magnitude scores is: ', \n",
    "             format(round(spearmanr, 2), nsmall = 2)))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "66a727ce",
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "R [conda env:ccc_protocols]",
   "language": "R",
   "name": "conda-env-ccc_protocols-r"
  },
  "language_info": {
   "codemirror_mode": "r",
   "file_extension": ".r",
   "mimetype": "text/x-r-source",
   "name": "R",
   "pygments_lexer": "r",
   "version": "4.2.3"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}