%We thank the reviewers for their detailed and insightful comments.

%

%This new submission is an extensive revision,

%which addresses the comments of the reviewers and

%provides significant improvements over the previous version.

%Before discussing the comments of the reviewers point-by-point,

%we provide a brief summary of the main changes.

%

%\begin{itemize}

%\item

%We have analyzed the convergence of the main iterative algorithm (Algorithm 1):

%we have argued that the algorithm terminates

%and shown that it requires $\mathcal{O}(nr)$ steps in the worst case.

%Our experiments show that this upper bound is pessimistic.

%In practice the algorithm terminates in a much smaller number of steps,

%usually fewer than~20.

%

%\item

%We present an improved version of the fractional-programming algorithm {\sc fast-ga} (Algorithm 4).

%Up to logarithmic factors,

%the running time of the improved version of {\sc fast-ga} is $\mathcal{O}(m)$,

%where $m$ is the number of interactions in the network.

%The same technique is also used to improve the binary-search approach described in Section IV-C.

%Up to logarithmic factors, the running time of the improved version of {\sc fast-ba} is $\mathcal{O}(Km)$,

%where $m$ is the number of interactions in the network, and $K$ is the number of intervals.

%

%In practice the improved versions give up to several orders of magnitude improvement.

%

%\item

%We provide worst-case running-time analysis for both fast algorithms, {\sc fast-ba} and {\sc fast-ga}.

%These are presented at the end of Sections IV-C and IV-D, respectively.

%

%\item

%We have modified our notation

%and we now use the term ``concave'' instead of ``submodular''.

%This is partly justified by the confusion that the term ``submodular'' caused to the reviewers,

%and partly by the fact that some authors refer to the property we are interested in as ``concave''.

%

%As the functions we work with are functions over intervals of reals, rather than sets of elements,

%we think that the term ``concave'' is more accurate

%and makes our presentation clearer.

%For a detailed discussion on the terminology,

%and references to other works that use functions with these properties,

%see the revised manuscript, Section IV-C.

%

%\item

%We have tested our methods with larger datasets

%of networks with up to 1 million interactions.

%Our results, presented in Figures 1 and 2,

%demonstrate that the fast versions of our algorithms,

%in particular the improved version of the {\sc fast-ga} algorithm,

%scale very well with the input size.

%

%\item

%We have improved the presentation of our proofs,

%by adding intermediate steps in the derivations,

%making the explanations more clear,

%and breaking them down using auxiliary lemmas.

%

%\item

%We have improved the presentation, added discussion, and strengthened the motivation in the introduction.

%\end{itemize}

%

%

%The responses to the individual remarks are now given below.

\section*{Reviewer 1}

\reviewer{The problem motivation, definition, meanings are vague. In Section 1, it would be considerably helpful if the example is expanded with more necessary information. For example, what is a main event/secondary event in the example of tweets? What is (can be) the reconstructed result of the example?}

\answer{expand the example}

\reviewer{As a more general comment, it would be helpful if a concrete temporal network (V, E, and timestamps) is shown for a given data setting like the tweet example. Currently, it is hard for the reviewer to build the connection between the real life data and the formal definitions that follows.}

\answer{expand the example}

\reviewer{Although it may make sense to reduce the vertex-covering problem (VC) to the network-untangling problems (NU), the reviewer got puzzled by how the time interval lengths in NU correspond to the number of vertices VC is questionable and vague in the current presentation. The current proof in 3.1 could be improved with more technical details (e.g., intermediate steps and perhaps examples).

a) In the if direction, how the 2n edges lead to a total span not larger than l?

b) In the other direction, the relation between n and l is not clear as well.}

\answer{expand proof 3.1}

\reviewer{There exists a possible naive algorithm as follows. Find the vertex with the highest degree, construct an interval (or k intervals) for it, delete it and all its edges. Repeat these steps for the remaining graph(s) until all vertices are deleted. It is interesting to know if this works and/or how it works in terms of efficiency and result quality.}

\answer{Do we want to implement that? Highest temporal degree or static degree?}
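To make the open question concrete, the following is a minimal sketch of the reviewer's greedy baseline as we read it, for the single-interval case. It uses the temporal degree (the number of remaining interactions of a vertex); the static variant would instead count distinct neighbours. The function name, the tie-breaking rule, and the input format (a list of triples $(u, v, t)$) are our own assumptions and are not taken from the manuscript.

```python
from collections import defaultdict


def greedy_degree_baseline(edges):
    """Greedy baseline sketch: repeatedly pick the vertex with the highest
    temporal degree, give it the tightest interval spanning its remaining
    interactions, and delete those interactions.

    edges: iterable of (u, v, t) triples; vertices must be orderable
           (assumption, used only for deterministic tie-breaking).
    Returns a dict mapping each chosen vertex to its (start, end) interval.
    """
    remaining = set(edges)
    intervals = {}
    while remaining:
        # Group the remaining interactions by endpoint.
        incident = defaultdict(list)
        for u, v, t in remaining:
            incident[u].append((u, v, t))
            incident[v].append((u, v, t))
        # Highest temporal degree first; ties go to the smallest vertex name.
        best = min(incident, key=lambda x: (-len(incident[x]), x))
        times = [t for _, _, t in incident[best]]
        intervals[best] = (min(times), max(times))
        # Every interaction of the chosen vertex is now covered.
        remaining.difference_update(incident[best])
    return intervals


if __name__ == "__main__":
    toy = [("a", "b", 1), ("a", "c", 5), ("b", "c", 3), ("d", "e", 7)]
    print(greedy_degree_baseline(toy))
```

On this hypothetical toy input the sketch assigns a long interval to the first vertex it picks, which suggests that the baseline may pay for wide intervals when a high-degree vertex has interactions spread over time. Whether this holds on real data would have to be checked experimentally.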

\reviewer{From the beginning paragraphs and the example in Introduction, the problem seems to be of IR or ML nature as time intervals need to be decided for events for which, however, limited information is known. It is therefore elusive why the problems are formulated as they are now, which are more or less search or optimization without any consideration on other information related to events. Justifications are needed for this gap. In addition, it should be analyzed why ground truth could still be approached although event characteristics are essentially ignored in the problem formulation and designed algorithms. This seems to me surprising.}

\answer{I do not understand this comment; it should probably be addressed in the introduction or in the related work.}

\reviewer{Section 6 should be more specific on the similarities and differences between this work and those on burst and event detection. The contrast should be more on the methodologies or technical solutions than on the problem settings.}

\answer{related work}

\reviewer{Scalability needs explicit experimental results in Section 7.3.}

\answer{Should we do this? Currently we report only typical running times for large datasets.}

\reviewer{As a suggestion related to 1 and 2 above, the Introduction would be reader-friendly to refer to the results shown in Fig. 7 and 8, explaining what a temporal network looks like, what the reconstructed timelines are, and how good the reconstructed results are.}

\answer{expand intro by experimental results}

\reviewer{In Introduction, it would also help if a short paragraph is used to tell which sections in this paper are new compared to the conference version. }

\answer{There is no clear division: we extended each section of the conference version by considering the $k$-interval case.}

%\reviewer{..it is not so clear how the problems are related to (esp. differ from) previous studies such as burst detection and event detection.}

%\note{}

%\answer{}

\section*{Reviewer 2}

\reviewer{Readability: The paper requires substantial effort to read and understand. This is true not only for the algorithmic part of the methods, but even for clearly understanding the problem of interest itself. Its readability would greatly benefit by a number of illustrations that aim to better communicate the problem, but also the main intuition of the algorithmic approach to solve it.}

\answer{intro, motivation, examples}

\reviewer{Definition: I found unintuitive the definition of a "covered" edge, where an edge is covered if at least one of $u$ or $v$ are active at time $t$.}

\answer{We can relate this to the vertex-cover problem.}
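One possible way to present this correspondence is sketched below. The notation $I_u$ for the activity interval of vertex $u$ and the toy network are our own illustrative assumptions, and we measure the cost as the sum of interval lengths, which is how we read the sum objective.
\[
\text{static: } (u,v) \text{ is covered by } U \iff u \in U \ \text{or}\ v \in U,
\qquad
\text{temporal: } (u,v,t) \text{ is covered} \iff t \in I_u \ \text{or}\ t \in I_v .
\]
For example, take two vertices $a$, $b$ with three interactions $(a,b,1)$, $(a,b,3)$, $(a,b,5)$. The choice $I_a = [1,5]$, $I_b = [3,3]$ covers all three interactions with total length $4 + 0 = 4$, whereas $I_a = [1,1]$, $I_b = [3,5]$ also covers them with total length $0 + 2 = 2$. Both are valid covers; the problem asks for the cheaper one.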

\reviewer{Limitations: What are the limitations (if any) of the algorithm(s). When they are expected to not work well? Any discussion about limitations can help the reader to further appreciate the contribution its potential applicability in realcase scenarios. For example, it appears as if the algorithm(s) assume that the whole temporal network is available as well as activity interval information (provided as input to the algorithm). This is fine for a historical analysis of the timelines. However, what would it mean to have a streaming version of these algorithms? How easy it would be to apply them in a streaming setting? How this would affect the theoretical results? It would be nice if authors can (briefly) discuss the streaming case scenario and other potential limitations (and challenges) of their approach.}

\answer{discuss this in the conclusion?}

\reviewer{Abstract/Conclusion: Abstract and conclusions fail to accurately describe the problem of interest. In fact, the reader needs to reach to the end of the problem statement to understand the actual problem. It's understandable that this is in part because the problem is new and because of its nature (where optimization functions need to be presented), however, I argue the authors to try to find a more intuitive way to describe the problem early on using plain language before the formal problem is introduced. Unfortunately, the motivating example of Brexit was not making it easier to understand, as it allows for multiple interpretations of the problem of interest. IMHO, this probably relates to the unintuitive definition of a "covered" edge as mentioned above (but it might just be me). As a workaround the author may want to move the sentence of "...a new problem for summarizing temporal networks..." from the Introduction, to the abstract; that helped.}

\answer{change plain language problem formulation}

\reviewer{Language: A few grammatical errors and typos in the manuscript (see below). p3: the vertices V consists of $\to$ the set of vertices V consists of OR the vertices V consist of p3: we also given a set $\to$ we are also given a set p4: iterations are needed, We $\to$ iterations are needed. We p9: "Extending our definition ... for future work." $\to$ This sentence should be removed/revised, as this version is the extension. p12/13: \#nokiaemg $\to$\#nokiaegm (multiple occurences throughout) p14: startap $\to$ startup}

\answer{Fixed.}

%\reviewer{Some grammatical errors exist and the language could be improved.}

%\reviewer{The paper is well-organized, so it is easy to follow, but substantial effort is requirent to fully understand the algorithmic details.}

@@ -28,7 +28,7 @@ In the \prbvertex problem we are asked to decide whether there exists a subset $

of at most $\ell$ vertices ($|U|\le\ell$) covering all edges in $A$.

We map an instance of \prbvertex to an instance of \prbsum by creating a temporal network $G =(V, E)$, as follows.

The vertices$V$ consists of $2n$ vertices:

The vertex set$V$ consists of $2n$ vertices:

for each $w_i \in W$, we add vertex $v_i$ and $u_i$.

The edges are as follows:

For each edge $(w_i, w_j)\in A$, we add a temporal edge $(v_i, v_j, 0)$ to $E$.

...

...

@@ -66,7 +66,7 @@ intractrable, but we cannot even approximate them.

\begin{proof}

To prove the result we provide a reduction from \prbvertex. Assume

that we are given a graph $G$ with $n$ vertices and an integer $k$.

We construct a temporal network $H$ as follows: We place $G$ at time stamp $0$.

We construct a temporal network $H$ as follows. We place $G$ at time stamp $0$.

We then add $k$ fully-connected graphs $C_1, \ldots, C_k$ with $n$ vertices at time stamps $1, \ldots, k$.

We claim that $H$ has $k$-interval cover with \emph{zero} cost if and only if

...

...

@@ -102,7 +102,7 @@ on the fact that we are dealing with edges and not hyper-edges.

Luckily, we can consider meaningful subproblems.

Assume that we are given a temporal network

$G =(V, E)$ and we also given a set of time points $\set{m_v}_{v \in V}$,

$G =(V, E)$ and we are also given a set of time points $\set{m_v}_{v \in V}$,

i.e., one time point $m_v$ for each vertex $v\in V$,

and we are asked whether we can find an optimal activity timeline $\tl=\set{\aint{u}}_{u\in V}$

so that the interval $\aint{v}$ of vertex $v$ contains the corresponding time point $m_v$,

...

...

@@ -217,9 +217,9 @@ Otherwise, we compute $m(i)$ to be the median of $W(i)$, ignoring any empty $W(i

and we test the median

of all $m(i)$ (weighted by $\abs{W(i)}$) as a new budget.

We can show that at each iteration $\sum\abs{W(i)}$ is reduced

by $1/4$, that is, only $\bigO{\log m}$ iterations are needed,

We can determine the medians $m(i)$ and the sizes $\abs{W(i)}$

in linear time since $T$ is sorted, and we can determine the weighted median in linear time by using a modified median-of-medians algorithm. This leads to running time

by $1/4$, that is, only $\bigO{\log m}$ iterations are needed.

We can determine the medians $m(i)$ and the sizes $\abs{W(i)}$

in linear time since $T$ is sorted, and we can determine the weighted median in linear time by using a modified median-of-medians algorithm. This leads to running time

yielding an $\bigO{m \log m}$ algorithm.

In our experiments

we use a straightforward binary search by testing $(U + L)/2$ as a budget.
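The iteration bound above can be spelled out as follows. This is only a sketch under two assumptions that the text does not state explicitly: that the initial total $S_0 = \sum_i \abs{W(i)}$ is polynomial in $m$, and that testing a single budget takes linear time. Writing $S_t$ for the value of $\sum_i \abs{W(i)}$ after $t$ iterations, and reading ``reduced by $1/4$'' as losing at least a quarter of the remaining candidates,
\[
S_t \le \left(\tfrac{3}{4}\right)^{t} S_0
\qquad\Longrightarrow\qquad
S_t < 1 \ \text{ once } \ t > \log_{4/3} S_0 = \bigO{\log m}.
\]
With $\bigO{m}$ work per iteration, the total is $\bigO{m} \cdot \bigO{\log m} = \bigO{m \log m}$. If the phrase instead means that the total shrinks to a quarter of its size, the decrease is still geometric and the same bound follows.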

@@ -4,13 +4,13 @@ In Figure~\ref{fig:nov2013} we show a subset of hashtags from tweets posted in N

We also depict the activity intervals for those hashtags, as discovered by algorithm \alginterior.

Note that for not cluttering the image, we depict only a subset of all relevant hashtags.

In particular, we pick 3 seed hashtags: {\tt\#slush13}, {\tt\#mtvema} and

{\tt\#nokiaemg} and the set of hashtags that co-occur with the seeds.

{\tt\#nokiaegm} and the set of hashtags that co-occur with the seeds.

Each of the seeds corresponds to a known event:

{\tt\#slush13} corresponds to Slush'13 --

the world's leading startup and tech event, organized

in Helsinki in November 13-14, 2013.

{\tt\#mtvema} is dedicated to MTV Europe Music Awards, held on 10 November, 2013.

{\tt\#nokiaemg} is Extraordinary General Meeting (EGM) of Nokia Corporation, held

{\tt\#nokiaegm} is Extraordinary General Meeting (EGM) of Nokia Corporation, held

in Helsinki in November 19, 2013.

For each hashtag we depict its activity intervals in blue. All hashtag's mentions are shown as small circles on the timeline. A circle is colored blue, if it falls into hashtag's discovered activity interval, and orange otherwise. We draw arched edges for interactions (co-occurrences) of two hashtags only if at the moment of interaction one hashtag is active and another one is not.

...

...

@@ -19,7 +19,7 @@ Figure~\ref{fig:nov2013} shows that the tag {\tt \#slush13} becomes active exact

at the starting date of the event. During its activity this tag covers many technical

\#younited} (personal cloud service by local company) and {\tt\#walkbase}

(local software company). Then on 19 November, the tag {\tt\#nokiaemg} becomes active:

(local software company). Then on 19 November, the tag {\tt\#nokiaegm} becomes active:

this event is very narrow and covers mentions of Microsoft executive Stephen

Elop. Another large event is occurring around 10 November with active tags {\tt

\#emazing}, {\tt\#ema2013} and {\tt\#mtvema}. They cover {\tt\#bestpop},

...

...

@@ -34,8 +34,8 @@ Elop. Another large event is occurring around 10 November with active tags {\tt

\vspace*{-5mm}

\caption{Part of the output of \alginterior algorithm on Twitter dataset

for November'13. Tags, co-occurring with

%hashtags {\tt \#slush13}, {\tt \#mtvema} and {\tt \#nokiaemg}. Activity intervals and active moments of interactions are colored blue, inactive moments of interactions are colored orange. Only edges between an active and inactive hashtags are shown.}

hashtags {\tt\#slush13}, {\tt\#mtvema} and {\tt\#nokiaemg}. Activity intervals and active moments of interactions (hashtags' co-occurrences) are colored blue, inactive moments of interactions are colored orange. Only edges between an active and inactive hashtags are shown.}

%hashtags {\tt \#slush13}, {\tt \#mtvema} and {\tt \#nokiaegm}. Activity intervals and active moments of interactions are colored blue, inactive moments of interactions are colored orange. Only edges between an active and inactive hashtags are shown.}

hashtags {\tt\#slush13}, {\tt\#mtvema} and {\tt\#nokiaegm}. Activity intervals and active moments of interactions (hashtags' co-occurrences) are colored blue, inactive moments of interactions are colored orange. Only edges between an active and inactive hashtags are shown.}

\label{fig:nov2013}

\end{figure*}

...

...

@@ -51,4 +51,4 @@ Elop. Another large event is occurring around 10 November with active tags {\tt

Many events have recurrent nature. For example, Slush conference is an annual event. In Figure~\ref{fig:slush} we show a subset of hashtags from tweets posted from January 2011 till December 2013. We run \algkinterior with $k=3$ and depict the activity intervals for some hashtags co-occuring with {\tt\#slush}.

Figure~\ref{fig:slush} shows that, although each year has its own tag for Slush ({\tt\#slush11},{\tt\#slush12}, {\tt\#slush13} and variants), tag {\tt\#slush} becomes active every November, when the event takes place. As before, it covers many company names and tech-related hashtags (e.g., {\tt\#supercell}, {\tt\#sailfish}, {\tt\#jolla}, {\tt\#aller}). On the other hand, even though we set $k=3$, hashtags, which are always active (such as startap-related hashtags {\tt\#startupsauna} and {\tt\#aaltoes}), have large activity intervals, which span the whole timeline.

\ No newline at end of file

Figure~\ref{fig:slush} shows that, although each year has its own tag for Slush ({\tt\#slush11},{\tt\#slush12}, {\tt\#slush13} and variants), tag {\tt\#slush} becomes active every November, when the event takes place. As before, it covers many company names and tech-related hashtags (e.g., {\tt\#supercell}, {\tt\#sailfish}, {\tt\#jolla}, {\tt\#aller}). On the other hand, even though we set $k=3$, hashtags, which are always active (such as startup-related hashtags {\tt\#startupsauna} and {\tt\#aaltoes}), have large activity intervals, which span the whole timeline.