% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/rem_stat_persistence.R
\name{computePersistence}
\alias{computePersistence}
\title{Compute Butts' (2008) Persistence Network Statistic for Event Dyads in a Relational Event Sequence}
\usage{
computePersistence(
  observed_time,
  observed_sender,
  observed_receiver,
  processed_time,
  processed_sender,
  processed_receiver,
  sender = TRUE,
  dependency = FALSE,
  relationalTimeSpan = NULL,
  nopastEvents = NA,
  sliding_windows = FALSE,
  processed_seqIDs = NULL,
  window_size = NA
)
}
\arguments{
\item{observed_time}{The vector of event times from the pre-processing event sequence.}

\item{observed_sender}{The vector of event senders from the pre-processing event sequence.}

\item{observed_receiver}{The vector of event receivers from the pre-processing event sequence.}

\item{processed_time}{The vector of event times from the post-processing event sequence (i.e., the event sequence that contains the observed and null events).}

\item{processed_sender}{The vector of event senders from the post-processing event sequence (i.e., the event sequence that contains the observed and null events).}

\item{processed_receiver}{The vector of event receivers from the post-processing event sequence (i.e., the event sequence that contains the observed and null events).}

\item{sender}{TRUE/FALSE. TRUE indicates that the persistence statistic will be computed in reference to the sender’s past relational history (see details section). FALSE indicates that the persistence statistic will be computed in reference to the target’s past relational history (see details section). Set to TRUE by default.}

\item{dependency}{TRUE/FALSE. TRUE indicates that temporal relevancy will be modeled (see the details section). FALSE indicates that temporal relevancy will not be modeled, that is, all past events are relevant (see the details section). Set to FALSE by default.}

\item{relationalTimeSpan}{If dependency = TRUE, a numerical value that corresponds to the temporal span for relational relevancy, which must be in the same measurement unit as the observed_time and processed_time objects. When dependency = TRUE, the relevant events are those that occurred between the current event time, \emph{t}, and \emph{t-relationalTimeSpan}. For example, if time is measured as the number of days since the first event and relationalTimeSpan is set to 10, then only those events which occurred in the past 10 days are included in the computation of the statistic.}

\item{nopastEvents}{The numerical value assigned to events in which the sender has not sent any past ties (i's neighborhood when sender = TRUE) or the receiver has not received any past ties (j's neighborhood when sender = FALSE). Set to NA by default.}

\item{sliding_windows}{TRUE/FALSE. TRUE indicates that the sliding windows computational approach will
be used to compute the network statistic, while FALSE indicates the approach will not be used. Set
to FALSE by default. The sliding windows framework should only be used
when the pre-processed event sequence is ‘big’, such as the pre-processed sequence of 360 million events
used in Lerner and Lomi (2020), as it aims to reduce the computational burden of sorting ‘big’ datasets. In general,
most pre-processed event sequences will not need the sliding windows
approach. There is no strict cutoff for a ‘big’ dataset; the definition depends on both the
size of the observed event sequence and the size of the post-processing sampled dataset. For instance,
according to our internal tests, when the event sequence is relatively large (i.e., 100,000
observed events), with the probability of sampling from the observed event sequence set to 0.05
and 10 controls per sampled event, the sliding windows framework for computing repetition
is about 11\% faster than the non-sliding windows framework. Yet, in a smaller dataset
(i.e., 10,000 observed events), the sliding windows framework is about 25\% slower than the
non-sliding windows framework under the same conditions.}

\item{processed_seqIDs}{If sliding_windows is set to TRUE, the vector of event sequence IDs from the post-processing event sequence. The event sequence IDs represent the index for when the event occurred in the observed event sequence (e.g., the 5th event in the sequence will have a value of 5 in this vector).}

\item{window_size}{If sliding_windows is set to TRUE, the sizes of the windows that are used for the sliding windows computational framework. If NA, the function internally divides the dataset into ten slices (may not be optimal).}
}
\value{
The vector of persistence network statistics for the relational event sequence.
}
\description{
This function computes the persistence network sufficient statistic for
a relational event sequence (see Butts 2008). Persistence measures the proportion of past ties sent from the event sender that went to the current event receiver.
Furthermore, the function allows persistence scores to be computed only
for the sampled events, while creating the weights based on the full event
sequence. Moreover, the function allows users to specify relational relevancy for the statistic and
to employ a sliding windows framework for large relational sequences.
}
\details{
The function calculates the persistence network sufficient statistic for a relational event sequence based on Butts (2008).

The formula for persistence for event \eqn{e_i} with reference to the sender's past relational history is:
\deqn{Persistence_{e_{i}} = \frac{d(s(e_{i}),r(e_{i}), A_t)}{d(s(e_{i}), A_t)} }

where  \eqn{d(s(e_{i}),r(e_{i}), A_t)} is the number of past events where the current event sender sent a tie to the current event receiver, and \eqn{d(s(e_{i}), A_t)} is the number of past events where the current sender sent a tie.

The formula for persistence for event \eqn{e_i} with reference to the target's past relational history is:
\deqn{Persistence_{e_{i}} = \frac{d(s(e_{i}),r(e_{i}), A_t)}{d(r(e_{i}), A_t)} }

where \eqn{d(s(e_{i}),r(e_{i}), A_t)} is the number of past events where the current event sender sent a tie to the current event receiver, and \eqn{d(r(e_{i}), A_t)} is the number of past events where the current receiver received a tie.
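
As an illustrative worked example (assuming that only events strictly before the current event count as past events), consider the last event of the dummy sequence in the examples section, in which D sends a tie to B at time 18. Before that event, D had sent three ties (to A at time 5, to B at time 11, and to A at time 15), one of which went to B, so the sender-based statistic is:
\deqn{Persistence_{e_{18}} = \frac{1}{3} }

Before the same event, B had received two ties (from A at time 1 and from D at time 11), one of which came from D, so the target-based statistic is:
\deqn{Persistence_{e_{18}} = \frac{1}{2} }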

Moreover, researchers interested in modeling temporal relevancy (see Quintane, Wood, Dunn, and Falzon 2022) can specify the relational time span, that is, the length of time for which events are considered
relationally relevant. This should be specified via the option \emph{relationalTimeSpan} with \emph{dependency} set to TRUE.
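
As an illustration of temporal relevancy (a sketch that assumes only events strictly before the current event are counted), take the event at time 18 of the dummy sequence in the examples section, in which D sends a tie to B, and set \emph{relationalTimeSpan} to 5. The relevant past events are then those at times 13 through 17; how an event falling exactly at \emph{t-relationalTimeSpan} is treated does not matter here, because the event at time 13 involves neither D as sender nor B as receiver. Within that window D sent a single tie (to A at time 15) and none to B, so the sender-based statistic is:
\deqn{Persistence_{e_{18}} = \frac{0}{1} = 0 }

B received no ties within the window, so the target-based denominator is zero and, following the description of \emph{nopastEvents}, the event would be expected to receive the \emph{nopastEvents} value instead of a ratio.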
}
\examples{


# A Dummy One-Mode Event Dataset
events <- data.frame(time = 1:18,
                                eventID = 1:18,
                                sender = c("A", "B", "C",
                                           "A", "D", "E",
                                           "F", "B", "A",
                                           "F", "D", "B",
                                           "G", "B", "D",
                                           "H", "A", "D"),
                                target = c("B", "C", "D",
                                           "E", "A", "F",
                                           "D", "A", "C",
                                           "G", "B", "C",
                                           "H", "J", "A",
                                           "F", "C", "B"))

# Creating the Post-Processing Event Dataset with Null Events
eventSet <- processOMEventSeq(data = events,
                          time = events$time,
                          eventID = events$eventID,
                          sender = events$sender,
                          receiver = events$target,
                          p_samplingobserved = 1.00,
                          n_controls = 6,
                          seed = 9999)

#Compute Persistence with respect to the sender's past relational history without
#the sliding windows framework and no temporal dependency
eventSet$persist <- computePersistence(observed_time = events$time,
                                        observed_receiver = events$target,
                                        observed_sender = events$sender,
                                        processed_time = eventSet$time,
                                        processed_receiver = eventSet$receiver,
                                        processed_sender = eventSet$sender,
                                        sender = TRUE,
                                        nopastEvents = 0)

#Compute Persistence with respect to the sender's past relational history with
#the sliding windows framework and no temporal dependency
eventSet$persistSW <- computePersistence(observed_time = events$time,
                                        observed_receiver = events$target,
                                        observed_sender = events$sender,
                                        processed_time = eventSet$time,
                                        processed_receiver = eventSet$receiver,
                                        processed_sender = eventSet$sender,
                                        sender = TRUE,
                                        sliding_windows = TRUE,
                                        processed_seqIDs = eventSet$sequenceID,
                                        nopastEvents = 0)

#The results with and without the sliding windows are the same (see correlation
#below). Using the sliding windows method is recommended when the data are
#'big' so that memory allotment is more efficient.
cor(eventSet$persist,eventSet$persistSW)


#Compute Persistence with respect to the sender's past relational history without
#the sliding windows framework and with temporal dependency
eventSet$persistDep <- computePersistence(observed_time = events$time,
                                        observed_receiver = events$target,
                                        observed_sender = events$sender,
                                        processed_time = eventSet$time,
                                        processed_receiver = eventSet$receiver,
                                        processed_sender = eventSet$sender,
                                        sender = TRUE,
                                        dependency = TRUE,
                                        relationalTimeSpan = 5, #the past 5 time units
                                        nopastEvents = 0)

#Compute Persistence with respect to the receiver's past relational history without
#the sliding windows framework and no temporal dependency
eventSet$persistT <- computePersistence(observed_time = events$time,
                                        observed_receiver = events$target,
                                        observed_sender = events$sender,
                                        processed_time = eventSet$time,
                                        processed_receiver = eventSet$receiver,
                                        processed_sender = eventSet$sender,
                                        sender = FALSE,
                                        nopastEvents = 0)

#Compute Persistence with respect to the receiver's past relational history with
#the sliding windows framework and no temporal dependency
eventSet$persistSWT <- computePersistence(observed_time = events$time,
                                        observed_receiver = events$target,
                                        observed_sender = events$sender,
                                        processed_time = eventSet$time,
                                        processed_receiver = eventSet$receiver,
                                        processed_sender = eventSet$sender,
                                        sender = FALSE,
                                        sliding_windows = TRUE,
                                        processed_seqIDs = eventSet$sequenceID,
                                        nopastEvents = 0)

#The results with and without the sliding windows are the same (see correlation
#below). Using the sliding windows method is recommended when the data are
#'big' so that memory allotment is more efficient.
cor(eventSet$persistT,eventSet$persistSWT)


#Compute Persistence with respect to the receiver's past relational history without
#the sliding windows framework and with temporal dependency
eventSet$persistDepT <- computePersistence(observed_time = events$time,
                                        observed_receiver = events$target,
                                        observed_sender = events$sender,
                                        processed_time = eventSet$time,
                                        processed_receiver = eventSet$receiver,
                                        processed_sender = eventSet$sender,
                                        sender = FALSE,
                                        dependency = TRUE,
                                        relationalTimeSpan = 5, #the past 5 time units
                                        nopastEvents = 0)
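
# Illustrative cross-check (a sketch, not the package's internal code): the
# hypothetical base-R helper manualPersistence() below recomputes the statistic
# for a single dyad directly from the formulas in the details section. It
# assumes that only events strictly earlier than 'time' count as past events
# and that, when 'span' is supplied, events at or after 'time - span' are
# retained; computePersistence() may treat these boundaries differently.
manualPersistence <- function(times, senders, receivers, s, r, time,
                              bySender = TRUE, span = NULL, nopast = NA) {
  # Flag the events that count as the relevant past of the current event
  past <- times < time
  if (!is.null(span)) {
    past <- past & times >= (time - span)
  }
  # Denominator: past ties sent by s (sender view) or received by r (target view)
  focal <- if (bySender) senders == s else receivers == r
  denom <- sum(past & focal)
  if (denom == 0) {
    return(nopast)
  }
  # Numerator: past ties sent from s to r
  sum(past & senders == s & receivers == r) / denom
}

# A small toy sequence (hypothetical data used only for this sketch)
toyTime     <- 1:6
toySender   <- c("A", "A", "C", "A", "C", "A")
toyReceiver <- c("B", "C", "B", "B", "B", "B")

# Sender view for the last event (A -> B at time 6): 2 of A's 3 past ties went to B
toySenderView <- manualPersistence(toyTime, toySender, toyReceiver,
                                   s = "A", r = "B", time = 6)

# Target view for the same event: 2 of the 4 past ties received by B came from A
toyTargetView <- manualPersistence(toyTime, toySender, toyReceiver,
                                   s = "A", r = "B", time = 6,
                                   bySender = FALSE)

# Sender view with a relevancy span of 2 time units: only A -> B at time 4 remains
toySpanView <- manualPersistence(toyTime, toySender, toyReceiver,
                                 s = "A", r = "B", time = 6, span = 2)

# The first event has no past, so the 'nopast' value is returned
toyFirstEvent <- manualPersistence(toyTime, toySender, toyReceiver,
                                   s = "A", r = "B", time = 1, nopast = 0)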

}
\references{
Butts, Carter T. 2008. "A relational event framework for social action." \emph{Sociological Methodology} 38(1): 155-200.

Quintane, Eric, Martin Wood, John Dunn, and Lucia Falzon. 2022. "Temporal
Brokering: A Measure of Brokerage as a Behavioral Process." \emph{Organizational Research Methods}
25(3): 459-489.
}
\author{
Kevin A. Carson \href{mailto:kacarson@arizona.edu}{kacarson@arizona.edu}, Diego F. Leal \href{mailto:dflc@arizona.edu}{dflc@arizona.edu}
}
